kernel/sched/fair.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
We encountered an issue where the kernel thread `ksmd` runs on the PMD
dedicated isolated core, leading to high latency in OVS packets.
Upon analysis, we discovered that this is caused by the current
select_idle_smt() function not taking the sched_domain mask into account.
Kernel version: linux-4.19.y
Signed-off-by: wujing <realwujing@qq.com>
Signed-off-by: QiLiang Yuan <yuanql9@chinatelecom.cn>
---
kernel/sched/fair.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 09f82c84474b..0950cabfc1d0 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6171,7 +6171,8 @@ static int select_idle_smt(struct task_struct *p, struct sched_domain *sd, int t
return -1;
for_each_cpu(cpu, cpu_smt_mask(target)) {
- if (!cpumask_test_cpu(cpu, &p->cpus_allowed))
+ if (!cpumask_test_cpu(cpu, &p->cpus_allowed) ||
+ !cpumask_test_cpu(cpu, sched_domain_span(sd)))
continue;
if (available_idle_cpu(cpu))
return cpu;
--
2.45.2
it has already been addressed by the following patches in tip/sched/core:
8aeaffef8c6e ("sched/fair: Take the scheduling domain into account in
select_idle_smt()")
23d04d8c6b8e ("sched/fair: Take the scheduling domain into account in
select_idle_core()")
在 2024/7/30 15:10, wujing 写道:
> We encountered an issue where the kernel thread `ksmd` runs on the PMD
> dedicated isolated core, leading to high latency in OVS packets.
>
> Upon analysis, we discovered that this is caused by the current
> select_idle_smt() function not taking the sched_domain mask into account.
>
> Kernel version: linux-4.19.y
>
> Signed-off-by: wujing <realwujing@qq.com>
> Signed-off-by: QiLiang Yuan <yuanql9@chinatelecom.cn>
> ---
> kernel/sched/fair.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 09f82c84474b..0950cabfc1d0 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6171,7 +6171,8 @@ static int select_idle_smt(struct task_struct *p, struct sched_domain *sd, int t
> return -1;
>
> for_each_cpu(cpu, cpu_smt_mask(target)) {
> - if (!cpumask_test_cpu(cpu, &p->cpus_allowed))
> + if (!cpumask_test_cpu(cpu, &p->cpus_allowed) ||
> + !cpumask_test_cpu(cpu, sched_domain_span(sd)))
> continue;
> if (available_idle_cpu(cpu))
> return cpu;
On Tue, Jul 30, 2024 at 03:10:50PM +0800, wujing wrote:
> We encountered an issue where the kernel thread `ksmd` runs on the PMD
> dedicated isolated core, leading to high latency in OVS packets.
>
> Upon analysis, we discovered that this is caused by the current
> select_idle_smt() function not taking the sched_domain mask into account.
>
> Kernel version: linux-4.19.y
If you're trying to backport something, I think you forgot to Cc stable
and provide the proper upstream commit.
As is this isn't something I can do anything with. The patch does not
apply to any recent kernel and AFAICT this issue has long since been
fixed.
>
> Signed-off-by: wujing <realwujing@qq.com>
> Signed-off-by: QiLiang Yuan <yuanql9@chinatelecom.cn>
> ---
> kernel/sched/fair.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 09f82c84474b..0950cabfc1d0 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6171,7 +6171,8 @@ static int select_idle_smt(struct task_struct *p, struct sched_domain *sd, int t
> return -1;
>
> for_each_cpu(cpu, cpu_smt_mask(target)) {
> - if (!cpumask_test_cpu(cpu, &p->cpus_allowed))
> + if (!cpumask_test_cpu(cpu, &p->cpus_allowed) ||
> + !cpumask_test_cpu(cpu, sched_domain_span(sd)))
> continue;
> if (available_idle_cpu(cpu))
> return cpu;
> --
> 2.45.2
>
> If you're trying to backport something, I think you forgot to Cc stable > and provide the proper upstream commit. > > As is this isn't something I can do anything with. The patch does not > apply to any recent kernel and AFAICT this issue has long since been > fixed. When fixing this bug, I didn't pay much attention to upstream changes. Upon reviewing the history of relevant commits, I found that they have been merged and reverted multiple times: ```bash git log -S 'cpumask_test_cpu(cpu, sched_domain_span(sd))' --oneline \ kernel/sched/fair.c 8aeaffef8c6e sched/fair: Take the scheduling domain into account in select_idle_smt() 3e6efe87cd5c sched/fair: Remove redundant check in select_idle_smt() 3e8c6c9aac42 sched/fair: Remove task_util from effective utilization in feec() c722f35b513f sched/fair: Bring back select_idle_smt(), but differently 6cd56ef1df39 sched/fair: Remove select_idle_smt() df3cb4ea1fb6 sched/fair: Fix wrong cpu selecting from isolated domain ``` The latest upstream commit 8aeaffef8c6e is not applicable to linux-4.19.y. The current patch has been tested on linux-4.19.y and I am looking forward to its inclusion in the stable version.
On Tue, Jul 30, 2024 at 05:03:38PM +0800, wujing wrote: > > If you're trying to backport something, I think you forgot to Cc stable > > and provide the proper upstream commit. > > > > As is this isn't something I can do anything with. The patch does not > > apply to any recent kernel and AFAICT this issue has long since been > > fixed. > > When fixing this bug, I didn't pay much attention to upstream changes. > Upon reviewing the history of relevant commits, I found that they have > been merged and reverted multiple times: > > ```bash > git log -S 'cpumask_test_cpu(cpu, sched_domain_span(sd))' --oneline \ > kernel/sched/fair.c > > 8aeaffef8c6e sched/fair: Take the scheduling domain into account in select_idle_smt() > 3e6efe87cd5c sched/fair: Remove redundant check in select_idle_smt() > 3e8c6c9aac42 sched/fair: Remove task_util from effective utilization in feec() > c722f35b513f sched/fair: Bring back select_idle_smt(), but differently > 6cd56ef1df39 sched/fair: Remove select_idle_smt() > df3cb4ea1fb6 sched/fair: Fix wrong cpu selecting from isolated domain > ``` > > The latest upstream commit 8aeaffef8c6e is not applicable to linux-4.19.y. > The current patch has been tested on linux-4.19.y and I am looking forward > to its inclusion in the stable version. What "current patch"? confused, greg k-h
> What "current patch"? > > confused, > > greg k-h The current patch is in my first email. Please ignore the previous two emails. The "current patch" mentioned in the earlier emails refers to the upstream status, but the latest upstream patch can no longer be applied to linux-4.19.y.
On Tue, Jul 30, 2024 at 05:40:17PM +0800, wujing wrote:
> > What "current patch"?
> >
> > confused,
> >
> > greg k-h
>
> The current patch is in my first email.
What message exactly? I don't see any such message on the stable list.
> Please ignore the previous two emails.
> The "current patch" mentioned in the earlier emails refers to the upstream
> status, but the latest upstream patch can no longer be applied to linux-4.19.y.
Again, please read:
https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.
thanks,
greg k-h
> On Tue, Jul 30, 2024 at 05:40:17PM +0800, wujing wrote:
> > > What "current patch"?
> > >
> > > confused,
> > >
> > > greg k-h
> >
> > The current patch is in my first email.
>
> What message exactly? I don't see any such message on the stable list.
>
> > Please ignore the previous two emails.
> > The "current patch" mentioned in the earlier emails refers to the upstream
> > status, but the latest upstream patch can no longer be applied to linux-4.19.y.
>
> Again, please read:
> https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
> for how to do this properly.
>
> thanks,
>
> greg k-h
The email you just replied to is correct.
I reviewed the link in the email, and according to the link,
the patch I submitted meets the third criterion. I have noted
Upstream commit <8aeaffef8c6e> in the patch log.
From 9d4ecc9314088c2b0aa39c2248fb5e64042f1eef Mon Sep 17 00:00:00 2001
From: wujing <realwujing@gmail.com>
Date: Tue, 30 Jul 2024 15:35:53 +0800
Subject: [PATCH] sched/fair: Correct CPU selection from isolated domain
We encountered an issue where the kernel thread `ksmd` runs on the PMD
dedicated isolated core, leading to high latency in OVS packets.
Upon analysis, we discovered that this is caused by the current
select_idle_smt() function not taking the sched_domain mask into account.
Upstream commit <8aeaffef8c6e>
Kernel version: linux-4.19.y
Signed-off-by: wujing <realwujing@qq.com>
Signed-off-by: QiLiang Yuan <yuanql9@chinatelecom.cn>
---
kernel/sched/fair.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 09f82c84474b..0950cabfc1d0 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6171,7 +6171,8 @@ static int select_idle_smt(struct task_struct *p, struct sched_domain *sd, int t
return -1;
for_each_cpu(cpu, cpu_smt_mask(target)) {
- if (!cpumask_test_cpu(cpu, &p->cpus_allowed))
+ if (!cpumask_test_cpu(cpu, &p->cpus_allowed) ||
+ !cpumask_test_cpu(cpu, sched_domain_span(sd)))
continue;
if (available_idle_cpu(cpu))
return cpu;
--
2.45.2
On Tue, Jul 30, 2024 at 06:11:06PM +0800, wujing wrote: > > On Tue, Jul 30, 2024 at 05:40:17PM +0800, wujing wrote: > > > > What "current patch"? > > > > > > > > confused, > > > > > > > > greg k-h > > > > > > The current patch is in my first email. > > > > What message exactly? I don't see any such message on the stable list. > > > > > Please ignore the previous two emails. > > > The "current patch" mentioned in the earlier emails refers to the upstream > > > status, but the latest upstream patch can no longer be applied to linux-4.19.y. > > > > Again, please read: > > https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html > > for how to do this properly. > > > > thanks, > > > > greg k-h > > The email you just replied to is correct. > > I reviewed the link in the email, and according to the link, > the patch I submitted meets the third criterion. I have noted > Upstream commit <8aeaffef8c6e> in the patch log. > > > > >From 9d4ecc9314088c2b0aa39c2248fb5e64042f1eef Mon Sep 17 00:00:00 2001 > From: wujing <realwujing@gmail.com> > Date: Tue, 30 Jul 2024 15:35:53 +0800 > Subject: [PATCH] sched/fair: Correct CPU selection from isolated domain > > We encountered an issue where the kernel thread `ksmd` runs on the PMD > dedicated isolated core, leading to high latency in OVS packets. > > Upon analysis, we discovered that this is caused by the current > select_idle_smt() function not taking the sched_domain mask into account. > > Upstream commit <8aeaffef8c6e> > > Kernel version: linux-4.19.y This is not in a format that I can take. Also, this does not match that commit id, so you need to document it really really well why this is different. And you lost the original authorship information, and the reviews. And finally, we can't take a change for 4.19.y that is also not in newer kernel releases, because if we did that, and you upgraded, you would have a regression. But most importantly, why are you still doing stuff on 4.19.y? This kernel is going to go end-of-life very soon, and you should already have all of your systems that rely on it off of it by now, or planned to within a few months. To not do that is to ensure that you will end up with insecure systems when the end-of-life deadline happens. thanks, greg k-h
> What "current patch"? > > confused, > > greg k-h ```bash git remote -v stable git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git (fetch) stable git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git (push) ``` ```bash git branch *master ``` In the git repository information provided above, 8aeaffef8c6e is one of the output items from the command `git log -S 'cpumask_test_cpu(cpu, \ sched_domain_span(sd))' --oneline kernel/sched/fair.c`.
On Tue, Jul 30, 2024 at 05:30:40PM +0800, wujing wrote:
> > What "current patch"?
> >
> > confused,
> >
> > greg k-h
>
> ```bash
> git remote -v
> stable git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git (fetch)
> stable git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git (push)
> ```
>
> ```bash
> git branch
> *master
> ```
>
> In the git repository information provided above, 8aeaffef8c6e is one of the
> output items from the command `git log -S 'cpumask_test_cpu(cpu, \
> sched_domain_span(sd))' --oneline kernel/sched/fair.c`.
>
Ok, but I do not understand what you are asking me to do here. You have
read:
https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to ask for a commit to be applied to a stable tree, right?
thanks,
greg k-h
> What "current patch"? > > confused, > > greg k-h ```bash git remote -v stable git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git (fetch) stable git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git (push) ``` ```bash git branch *master ``` In the git repository information provided above, 8aeaffef8c6e is one of the output items from the command `git log -S 'cpumask_test_cpu(cpu, \ sched_domain_span(sd))' --oneline kernel/sched/fair.c`.
© 2016 - 2025 Red Hat, Inc.