Make wait_task_inactive()'s @match_state work like ttwu()'s @state.
That is, instead of an equal comparison, use it as a mask. This allows
matching multiple block conditions.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
kernel/sched/core.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3295,7 +3295,7 @@ unsigned long wait_task_inactive(struct
* is actually now running somewhere else!
*/
while (task_running(rq, p)) {
- if (match_state && unlikely(READ_ONCE(p->__state) != match_state))
+ if (match_state && !(READ_ONCE(p->__state) & match_state))
return 0;
cpu_relax();
}
@@ -3310,7 +3310,7 @@ unsigned long wait_task_inactive(struct
running = task_running(rq, p);
queued = task_on_rq_queued(p);
ncsw = 0;
- if (!match_state || READ_ONCE(p->__state) == match_state)
+ if (!match_state || (READ_ONCE(p->__state) & match_state))
ncsw = p->nvcsw | LONG_MIN; /* sets MSB */
task_rq_unlock(rq, p, &rf);
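For illustration, a caller-side sketch of the semantic change (not part of the
patch; TASK_FROZEN is used purely as an example of a second block state a
caller might want to accept):

	/*
	 * Old behaviour: @match_state had to equal ->__state exactly, so only
	 * one specific block state could be waited for:
	 */
	ncsw = wait_task_inactive(p, TASK_UNINTERRUPTIBLE);

	/*
	 * New behaviour: @match_state is a mask, so several block conditions
	 * are accepted at once; a zero return still means @p woke up or moved
	 * to a state outside the mask:
	 */
	ncsw = wait_task_inactive(p, TASK_UNINTERRUPTIBLE | TASK_FROZEN);
	if (!ncsw)
		return -EAGAIN;	/* @p did not settle in a matching state */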
* Peter Zijlstra <peterz@infradead.org> wrote:
> Make wait_task_inactive()'s @match_state work like ttwu()'s @state.
>
> That is, instead of an equal comparison, use it as a mask. This allows
> matching multiple block conditions.
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
> kernel/sched/core.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -3295,7 +3295,7 @@ unsigned long wait_task_inactive(struct
> * is actually now running somewhere else!
> */
> while (task_running(rq, p)) {
> - if (match_state && unlikely(READ_ONCE(p->__state) != match_state))
> + if (match_state && !(READ_ONCE(p->__state) & match_state))
> return 0;
We lose the unlikely annotation there - but I guess it probably never
really mattered anyway?
Suggestion #1:
- Shouldn't we rename task_running() to something like task_on_cpu()? The
task_running() primitive is similar to TASK_RUNNING but is not based off
any TASK_FLAGS.
Suggestion #2:
- Shouldn't we eventually standardize on task->on_cpu on UP kernels too?
They don't really matter anymore, and doing so removes #ifdefs and makes
the code easier to read.
> cpu_relax();
> }
> @@ -3310,7 +3310,7 @@ unsigned long wait_task_inactive(struct
> running = task_running(rq, p);
> queued = task_on_rq_queued(p);
> ncsw = 0;
> - if (!match_state || READ_ONCE(p->__state) == match_state)
> + if (!match_state || (READ_ONCE(p->__state) & match_state))
> ncsw = p->nvcsw | LONG_MIN; /* sets MSB */
> task_rq_unlock(rq, p, &rf);
Suggestion #3:
- Couldn't the following users with a 0 mask:
drivers/powercap/idle_inject.c: wait_task_inactive(iit->tsk, 0);
fs/coredump.c: wait_task_inactive(ptr->task, 0);
Use ~0 instead (exposed as TASK_ANY or so) and then we can drop the
!match_state special case?
They'd do something like:
drivers/powercap/idle_inject.c: wait_task_inactive(iit->tsk, TASK_ANY);
fs/coredump.c: wait_task_inactive(ptr->task, TASK_ANY);
It's not a 100% equivalent transformation, but it looks OK
at first sight: ->__state will be some nonzero mask for genuine tasks
waiting to schedule out, so any match will be functionally the same as a
0 flag telling us not to check any of the bits, right? I might be missing
something though.
Thanks,
Ingo
On Sun, Sep 04, 2022 at 12:44:36PM +0200, Ingo Molnar wrote:
>
> * Peter Zijlstra <peterz@infradead.org> wrote:
>
> > Make wait_task_inactive()'s @match_state work like ttwu()'s @state.
> >
> > That is, instead of an equal comparison, use it as a mask. This allows
> > matching multiple block conditions.
> >
> > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > ---
> > kernel/sched/core.c | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -3295,7 +3295,7 @@ unsigned long wait_task_inactive(struct
> > * is actually now running somewhere else!
> > */
> > while (task_running(rq, p)) {
> > - if (match_state && unlikely(READ_ONCE(p->__state) != match_state))
> > + if (match_state && !(READ_ONCE(p->__state) & match_state))
> > return 0;
>
> We lose the unlikely annotation there - but I guess it probably never
> really mattered anyway?
So any wait_task_inactive() caller does want that case to be true, but
the whole match_state precondition mostly wrecks things anyway. If
anything it should've been:
if (likely(match_state && !(READ_ONCE(p->__state) & match_state)))
return 0;
but I can't find it in me to care too much here.
> Suggestion #1:
>
> - Shouldn't we rename task_running() to something like task_on_cpu()? The
> task_running() primitive is similar to TASK_RUNNING but is not based off
> any TASK_FLAGS.
That looks like a simple enough patch, lemme go do that.
> Suggestion #2:
>
> - Shouldn't we eventually standardize on task->on_cpu on UP kernels too?
> They don't really matter anymore, and doing so removes #ifdefs and makes
> the code easier to read.
Probably, but that sounds like something that'll spiral out of control
real quick, so I'll leave that on the TODO list somewhere.
> > cpu_relax();
> > }
> > @@ -3310,7 +3310,7 @@ unsigned long wait_task_inactive(struct
> > running = task_running(rq, p);
> > queued = task_on_rq_queued(p);
> > ncsw = 0;
> > - if (!match_state || READ_ONCE(p->__state) == match_state)
> > + if (!match_state || (READ_ONCE(p->__state) & match_state))
> > ncsw = p->nvcsw | LONG_MIN; /* sets MSB */
> > task_rq_unlock(rq, p, &rf);
>
> Suggestion #3:
>
> - Couldn't the following users with a 0 mask:
>
> drivers/powercap/idle_inject.c: wait_task_inactive(iit->tsk, 0);
> fs/coredump.c: wait_task_inactive(ptr->task, 0);
>
> Use ~0 instead (exposed as TASK_ANY or so) and then we can drop the
> !match_state special case?
>
> They'd do something like:
>
> drivers/powercap/idle_inject.c: wait_task_inactive(iit->tsk, TASK_ANY);
> fs/coredump.c: wait_task_inactive(ptr->task, TASK_ANY);
>
> It's not a 100% equivalent transformation, but it looks OK
> at first sight: ->__state will be some nonzero mask for genuine tasks
> waiting to schedule out, so any match will be functionally the same as a
> 0 flag telling us not to check any of the bits, right? I might be missing
> something though.
I too am thinking that should work. Added patch for that.
* Peter Zijlstra <peterz@infradead.org> wrote:
> On Sun, Sep 04, 2022 at 12:44:36PM +0200, Ingo Molnar wrote:
> >
> > * Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > > Make wait_task_inactive()'s @match_state work like ttwu()'s @state.
> > >
> > > That is, instead of an equal comparison, use it as a mask. This allows
> > > matching multiple block conditions.
> > >
> > > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > > ---
> > > kernel/sched/core.c | 4 ++--
> > > 1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > > --- a/kernel/sched/core.c
> > > +++ b/kernel/sched/core.c
> > > @@ -3295,7 +3295,7 @@ unsigned long wait_task_inactive(struct
> > > * is actually now running somewhere else!
> > > */
> > > while (task_running(rq, p)) {
> > > - if (match_state && unlikely(READ_ONCE(p->__state) != match_state))
> > > + if (match_state && !(READ_ONCE(p->__state) & match_state))
> > > return 0;
> >
> > We lose the unlikely annotation there - but I guess it probably never
> > really mattered anyway?
>
> So any wait_task_inactive() caller does want that case to be true, but
> the whole match_state precondition mostly wrecks things anyway. If
> anything it should've been:
>
> if (likely(match_state && !(READ_ONCE(p->__state) & match_state)))
> return 0;
>
> but I can't find it in me to care too much here.
Yeah, I agree that this is probably the most likely branch - and default
compiler code generation behavior should be pretty close to that to begin
with.
Ie. ack on dropping the unlikely() annotation. :-)
Might make sense to add a sentence to the changelog though, in case anyone
(like me) is wondering about whether the dropped annotation was intended.
Thanks,
Ingo
On Tue, Sep 06, 2022 at 12:54:34PM +0200, Peter Zijlstra wrote:
> > Suggestion #1:
> >
> > - Shouldn't we rename task_running() to something like task_on_cpu()? The
> > task_running() primitive is similar to TASK_RUNNING but is not based off
> > any TASK_FLAGS.
>
> That looks like a simple enough patch, lemme go do that.
---
Subject: sched: Rename task_running() to task_on_cpu()
From: Peter Zijlstra <peterz@infradead.org>
Date: Tue Sep 6 12:33:04 CEST 2022
There is some ambiguity about task_running() in that it is unrelated
to TASK_RUNNING but instead tests ->on_cpu. As such, rename the thing
task_on_cpu().
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
kernel/sched/core.c | 10 +++++-----
kernel/sched/core_sched.c | 2 +-
kernel/sched/deadline.c | 6 +++---
kernel/sched/fair.c | 2 +-
kernel/sched/rt.c | 6 +++---
kernel/sched/sched.h | 2 +-
6 files changed, 14 insertions(+), 14 deletions(-)
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2778,7 +2778,7 @@ static int affine_move_task(struct rq *r
return -EINVAL;
}
- if (task_running(rq, p) || READ_ONCE(p->__state) == TASK_WAKING) {
+ if (task_on_cpu(rq, p) || READ_ONCE(p->__state) == TASK_WAKING) {
/*
* MIGRATE_ENABLE gets here because 'p == current', but for
* anything else we cannot do is_migration_disabled(), punt
@@ -3290,11 +3290,11 @@ unsigned long wait_task_inactive(struct
*
* NOTE! Since we don't hold any locks, it's not
* even sure that "rq" stays as the right runqueue!
- * But we don't care, since "task_running()" will
+ * But we don't care, since "task_on_cpu()" will
* return false if the runqueue has changed and p
* is actually now running somewhere else!
*/
- while (task_running(rq, p)) {
+ while (task_on_cpu(rq, p)) {
if (match_state && unlikely(READ_ONCE(p->__state) != match_state))
return 0;
cpu_relax();
@@ -3307,7 +3307,7 @@ unsigned long wait_task_inactive(struct
*/
rq = task_rq_lock(p, &rf);
trace_sched_wait_task(p);
- running = task_running(rq, p);
+ running = task_on_cpu(rq, p);
queued = task_on_rq_queued(p);
ncsw = 0;
if (!match_state || READ_ONCE(p->__state) == match_state)
@@ -8649,7 +8649,7 @@ int __sched yield_to(struct task_struct
if (curr->sched_class != p->sched_class)
goto out_unlock;
- if (task_running(p_rq, p) || !task_is_running(p))
+ if (task_on_cpu(p_rq, p) || !task_is_running(p))
goto out_unlock;
yielded = curr->sched_class->yield_to_task(rq, p);
--- a/kernel/sched/core_sched.c
+++ b/kernel/sched/core_sched.c
@@ -88,7 +88,7 @@ static unsigned long sched_core_update_c
* core has now entered/left forced idle state. Defer accounting to the
* next scheduling edge, rather than always forcing a reschedule here.
*/
- if (task_running(rq, p))
+ if (task_on_cpu(rq, p))
resched_curr(rq);
task_rq_unlock(rq, p, &rf);
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -2087,7 +2087,7 @@ static void task_fork_dl(struct task_str
static int pick_dl_task(struct rq *rq, struct task_struct *p, int cpu)
{
- if (!task_running(rq, p) &&
+ if (!task_on_cpu(rq, p) &&
cpumask_test_cpu(cpu, &p->cpus_mask))
return 1;
return 0;
@@ -2241,7 +2241,7 @@ static struct rq *find_lock_later_rq(str
if (double_lock_balance(rq, later_rq)) {
if (unlikely(task_rq(task) != rq ||
!cpumask_test_cpu(later_rq->cpu, &task->cpus_mask) ||
- task_running(rq, task) ||
+ task_on_cpu(rq, task) ||
!dl_task(task) ||
!task_on_rq_queued(task))) {
double_unlock_balance(rq, later_rq);
@@ -2475,7 +2475,7 @@ static void pull_dl_task(struct rq *this
*/
static void task_woken_dl(struct rq *rq, struct task_struct *p)
{
- if (!task_running(rq, p) &&
+ if (!task_on_cpu(rq, p) &&
!test_tsk_need_resched(rq->curr) &&
p->nr_cpus_allowed > 1 &&
dl_task(rq->curr) &&
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7938,7 +7938,7 @@ int can_migrate_task(struct task_struct
/* Record that we found at least one task that could run on dst_cpu */
env->flags &= ~LBF_ALL_PINNED;
- if (task_running(env->src_rq, p)) {
+ if (task_on_cpu(env->src_rq, p)) {
schedstat_inc(p->stats.nr_failed_migrations_running);
return 0;
}
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -1849,7 +1849,7 @@ static void put_prev_task_rt(struct rq *
static int pick_rt_task(struct rq *rq, struct task_struct *p, int cpu)
{
- if (!task_running(rq, p) &&
+ if (!task_on_cpu(rq, p) &&
cpumask_test_cpu(cpu, &p->cpus_mask))
return 1;
@@ -2004,7 +2004,7 @@ static struct rq *find_lock_lowest_rq(st
*/
if (unlikely(task_rq(task) != rq ||
!cpumask_test_cpu(lowest_rq->cpu, &task->cpus_mask) ||
- task_running(rq, task) ||
+ task_on_cpu(rq, task) ||
!rt_task(task) ||
!task_on_rq_queued(task))) {
@@ -2462,7 +2462,7 @@ static void pull_rt_task(struct rq *this
*/
static void task_woken_rt(struct rq *rq, struct task_struct *p)
{
- bool need_to_push = !task_running(rq, p) &&
+ bool need_to_push = !task_on_cpu(rq, p) &&
!test_tsk_need_resched(rq->curr) &&
p->nr_cpus_allowed > 1 &&
(dl_task(rq->curr) || rt_task(rq->curr)) &&
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -2060,7 +2060,7 @@ static inline int task_current(struct rq
return rq->curr == p;
}
-static inline int task_running(struct rq *rq, struct task_struct *p)
+static inline int task_on_cpu(struct rq *rq, struct task_struct *p)
{
#ifdef CONFIG_SMP
return p->on_cpu;
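For context on Suggestion #2 above: the hunk ends inside the #ifdef; as best I
can tell the rest of the helper falls back to task_current() on UP, and that
#ifdef is exactly what standardizing on ->on_cpu would remove:

	static inline int task_on_cpu(struct rq *rq, struct task_struct *p)
	{
	#ifdef CONFIG_SMP
		return p->on_cpu;		/* set while p physically occupies a CPU */
	#else
		return task_current(rq, p);	/* UP: "on cpu" == being rq->curr */
	#endif
	}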
The following commit has been merged into the sched/core branch of tip:
Commit-ID: 0b9d46fc5ef7a457cc635b30b010081228cb81ac
Gitweb: https://git.kernel.org/tip/0b9d46fc5ef7a457cc635b30b010081228cb81ac
Author: Peter Zijlstra <peterz@infradead.org>
AuthorDate: Tue, 06 Sep 2022 12:33:04 +02:00
Committer: Peter Zijlstra <peterz@infradead.org>
CommitterDate: Wed, 07 Sep 2022 21:53:47 +02:00
sched: Rename task_running() to task_on_cpu()
There is some ambiguity about task_running() in that it is unrelated
to TASK_RUNNING but instead tests ->on_cpu. As such, rename the thing
task_on_cpu().
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/Yxhkhn55uHZx+NGl@hirez.programming.kicks-ass.net
---
kernel/sched/core.c | 10 +++++-----
kernel/sched/core_sched.c | 2 +-
kernel/sched/deadline.c | 6 +++---
kernel/sched/fair.c | 2 +-
kernel/sched/rt.c | 6 +++---
kernel/sched/sched.h | 2 +-
6 files changed, 14 insertions(+), 14 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 7d289d8..1630181 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2777,7 +2777,7 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag
return -EINVAL;
}
- if (task_running(rq, p) || READ_ONCE(p->__state) == TASK_WAKING) {
+ if (task_on_cpu(rq, p) || READ_ONCE(p->__state) == TASK_WAKING) {
/*
* MIGRATE_ENABLE gets here because 'p == current', but for
* anything else we cannot do is_migration_disabled(), punt
@@ -3289,11 +3289,11 @@ unsigned long wait_task_inactive(struct task_struct *p, unsigned int match_state
*
* NOTE! Since we don't hold any locks, it's not
* even sure that "rq" stays as the right runqueue!
- * But we don't care, since "task_running()" will
+ * But we don't care, since "task_on_cpu()" will
* return false if the runqueue has changed and p
* is actually now running somewhere else!
*/
- while (task_running(rq, p)) {
+ while (task_on_cpu(rq, p)) {
if (match_state && unlikely(READ_ONCE(p->__state) != match_state))
return 0;
cpu_relax();
@@ -3306,7 +3306,7 @@ unsigned long wait_task_inactive(struct task_struct *p, unsigned int match_state
*/
rq = task_rq_lock(p, &rf);
trace_sched_wait_task(p);
- running = task_running(rq, p);
+ running = task_on_cpu(rq, p);
queued = task_on_rq_queued(p);
ncsw = 0;
if (!match_state || READ_ONCE(p->__state) == match_state)
@@ -8648,7 +8648,7 @@ again:
if (curr->sched_class != p->sched_class)
goto out_unlock;
- if (task_running(p_rq, p) || !task_is_running(p))
+ if (task_on_cpu(p_rq, p) || !task_is_running(p))
goto out_unlock;
yielded = curr->sched_class->yield_to_task(rq, p);
diff --git a/kernel/sched/core_sched.c b/kernel/sched/core_sched.c
index 1ec807f..a57fd8f 100644
--- a/kernel/sched/core_sched.c
+++ b/kernel/sched/core_sched.c
@@ -88,7 +88,7 @@ static unsigned long sched_core_update_cookie(struct task_struct *p,
* core has now entered/left forced idle state. Defer accounting to the
* next scheduling edge, rather than always forcing a reschedule here.
*/
- if (task_running(rq, p))
+ if (task_on_cpu(rq, p))
resched_curr(rq);
task_rq_unlock(rq, p, &rf);
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index d0fe6a2..86dea6a 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -2092,7 +2092,7 @@ static void task_fork_dl(struct task_struct *p)
static int pick_dl_task(struct rq *rq, struct task_struct *p, int cpu)
{
- if (!task_running(rq, p) &&
+ if (!task_on_cpu(rq, p) &&
cpumask_test_cpu(cpu, &p->cpus_mask))
return 1;
return 0;
@@ -2244,7 +2244,7 @@ static struct rq *find_lock_later_rq(struct task_struct *task, struct rq *rq)
if (double_lock_balance(rq, later_rq)) {
if (unlikely(task_rq(task) != rq ||
!cpumask_test_cpu(later_rq->cpu, &task->cpus_mask) ||
- task_running(rq, task) ||
+ task_on_cpu(rq, task) ||
!dl_task(task) ||
!task_on_rq_queued(task))) {
double_unlock_balance(rq, later_rq);
@@ -2474,7 +2474,7 @@ skip:
*/
static void task_woken_dl(struct rq *rq, struct task_struct *p)
{
- if (!task_running(rq, p) &&
+ if (!task_on_cpu(rq, p) &&
!test_tsk_need_resched(rq->curr) &&
p->nr_cpus_allowed > 1 &&
dl_task(rq->curr) &&
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 7bad641..4e5b171 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7935,7 +7935,7 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
/* Record that we found at least one task that could run on dst_cpu */
env->flags &= ~LBF_ALL_PINNED;
- if (task_running(env->src_rq, p)) {
+ if (task_on_cpu(env->src_rq, p)) {
schedstat_inc(p->stats.nr_failed_migrations_running);
return 0;
}
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 27e694c..d869bcf 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -1845,7 +1845,7 @@ static void put_prev_task_rt(struct rq *rq, struct task_struct *p)
static int pick_rt_task(struct rq *rq, struct task_struct *p, int cpu)
{
- if (!task_running(rq, p) &&
+ if (!task_on_cpu(rq, p) &&
cpumask_test_cpu(cpu, &p->cpus_mask))
return 1;
@@ -2000,7 +2000,7 @@ static struct rq *find_lock_lowest_rq(struct task_struct *task, struct rq *rq)
*/
if (unlikely(task_rq(task) != rq ||
!cpumask_test_cpu(lowest_rq->cpu, &task->cpus_mask) ||
- task_running(rq, task) ||
+ task_on_cpu(rq, task) ||
!rt_task(task) ||
!task_on_rq_queued(task))) {
@@ -2458,7 +2458,7 @@ skip:
*/
static void task_woken_rt(struct rq *rq, struct task_struct *p)
{
- bool need_to_push = !task_running(rq, p) &&
+ bool need_to_push = !task_on_cpu(rq, p) &&
!test_tsk_need_resched(rq->curr) &&
p->nr_cpus_allowed > 1 &&
(dl_task(rq->curr) || rt_task(rq->curr)) &&
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index b75ac74..91b2c7e 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -2051,7 +2051,7 @@ static inline int task_current(struct rq *rq, struct task_struct *p)
return rq->curr == p;
}
-static inline int task_running(struct rq *rq, struct task_struct *p)
+static inline int task_on_cpu(struct rq *rq, struct task_struct *p)
{
#ifdef CONFIG_SMP
return p->on_cpu;
On Tue, Sep 06, 2022 at 12:54:34PM +0200, Peter Zijlstra wrote:
> > Suggestion #3:
> >
> > - Couldn't the following users with a 0 mask:
> >
> > drivers/powercap/idle_inject.c: wait_task_inactive(iit->tsk, 0);
> > fs/coredump.c: wait_task_inactive(ptr->task, 0);
> >
> > Use ~0 instead (exposed as TASK_ANY or so) and then we can drop the
> > !match_state special case?
> >
> > They'd do something like:
> >
> > drivers/powercap/idle_inject.c: wait_task_inactive(iit->tsk, TASK_ANY);
> > fs/coredump.c: wait_task_inactive(ptr->task, TASK_ANY);
> >
> > It's not a 100% equivalent transformation, but it looks OK
> > at first sight: ->__state will be some nonzero mask for genuine tasks
> > waiting to schedule out, so any match will be functionally the same as a
> > 0 flag telling us not to check any of the bits, right? I might be missing
> > something though.
>
> I too am thinking that should work. Added patch for that.
---
Subject: sched: Add TASK_ANY for wait_task_inactive()
From: Peter Zijlstra <peterz@infradead.org>
Date: Tue Sep 6 12:39:55 CEST 2022
Now that wait_task_inactive()'s @match_state argument is a mask (like
ttwu()) it is possible to replace the special !match_state case with
an 'all-states' value such that any blocked state will match.
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
drivers/powercap/idle_inject.c | 2 +-
fs/coredump.c | 2 +-
include/linux/sched.h | 2 ++
kernel/sched/core.c | 16 ++++++++--------
4 files changed, 12 insertions(+), 10 deletions(-)
--- a/drivers/powercap/idle_inject.c
+++ b/drivers/powercap/idle_inject.c
@@ -254,7 +254,7 @@ void idle_inject_stop(struct idle_inject
iit = per_cpu_ptr(&idle_inject_thread, cpu);
iit->should_run = 0;
- wait_task_inactive(iit->tsk, 0);
+ wait_task_inactive(iit->tsk, TASK_ANY);
}
cpu_hotplug_enable();
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -412,7 +412,7 @@ static int coredump_wait(int exit_code,
*/
ptr = core_state->dumper.next;
while (ptr != NULL) {
- wait_task_inactive(ptr->task, 0);
+ wait_task_inactive(ptr->task, TASK_ANY);
ptr = ptr->next;
}
}
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -101,6 +101,8 @@ struct task_group;
#define TASK_RTLOCK_WAIT 0x1000
#define TASK_STATE_MAX 0x2000
+#define TASK_ANY (TASK_STATE_MAX-1)
+
/* Convenience macros for the sake of set_current_state: */
#define TASK_KILLABLE (TASK_WAKEKILL | TASK_UNINTERRUPTIBLE)
#define TASK_STOPPED (TASK_WAKEKILL | __TASK_STOPPED)
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3254,12 +3254,12 @@ int migrate_swap(struct task_struct *cur
/*
* wait_task_inactive - wait for a thread to unschedule.
*
- * If @match_state is nonzero, it's the @p->state value just checked and
- * not expected to change. If it changes, i.e. @p might have woken up,
- * then return zero. When we succeed in waiting for @p to be off its CPU,
- * we return a positive number (its total switch count). If a second call
- * a short while later returns the same number, the caller can be sure that
- * @p has remained unscheduled the whole time.
+ * Wait for the thread to block in any of the states set in @match_state.
+ * If it changes, i.e. @p might have woken up, then return zero. When we
+ * succeed in waiting for @p to be off its CPU, we return a positive number
+ * (its total switch count). If a second call a short while later returns the
+ * same number, the caller can be sure that @p has remained unscheduled the
+ * whole time.
*
* The caller must ensure that the task *will* unschedule sometime soon,
* else this function might spin for a *long* time. This function can't
@@ -3295,7 +3295,7 @@ unsigned long wait_task_inactive(struct
* is actually now running somewhere else!
*/
while (task_on_cpu(rq, p)) {
- if (match_state && !(READ_ONCE(p->__state) & match_state))
+ if (!(READ_ONCE(p->__state) & match_state))
return 0;
cpu_relax();
}
@@ -3310,7 +3310,7 @@ unsigned long wait_task_inactive(struct
running = task_on_cpu(rq, p);
queued = task_on_rq_queued(p);
ncsw = 0;
- if (!match_state || (READ_ONCE(p->__state) & match_state))
+ if (READ_ONCE(p->__state) & match_state)
ncsw = p->nvcsw | LONG_MIN; /* sets MSB */
task_rq_unlock(rq, p, &rf);
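A quick sanity check on the new constant (illustrative only, not part of the
patch): given the state bits defined above, TASK_ANY covers every possible
->__state bit, so any task that is genuinely scheduling out will match, which
is what the old !match_state special case effectively granted:

	/*
	 * TASK_STATE_MAX == 0x2000, hence TASK_ANY == 0x1fff: all state bits
	 * from 0x0001 (TASK_INTERRUPTIBLE) up to 0x1000 (TASK_RTLOCK_WAIT).
	 *
	 * A blocked task has a nonzero ->__state, so
	 * READ_ONCE(p->__state) & TASK_ANY is nonzero for it -- functionally
	 * the same as the old "don't check the bits at all" behaviour of 0.
	 */
	wait_task_inactive(iit->tsk, TASK_ANY);	/* was: wait_task_inactive(iit->tsk, 0) */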
The following commit has been merged into the sched/core branch of tip:
Commit-ID: 9204a97f7ae862fc8a3330ec8335917534c3fb63
Gitweb: https://git.kernel.org/tip/9204a97f7ae862fc8a3330ec8335917534c3fb63
Author: Peter Zijlstra <peterz@infradead.org>
AuthorDate: Mon, 22 Aug 2022 13:18:19 +02:00
Committer: Peter Zijlstra <peterz@infradead.org>
CommitterDate: Wed, 07 Sep 2022 21:53:48 +02:00
sched: Change wait_task_inactive()s match_state
Make wait_task_inactive()'s @match_state work like ttwu()'s @state.
That is, instead of an equal comparison, use it as a mask. This allows
matching multiple block conditions.
(removes the unlikely; it doesn't make sense how it's only part of the
condition)
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220822114648.856734578@infradead.org
---
kernel/sched/core.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 1630181..43d71c6 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3294,7 +3294,7 @@ unsigned long wait_task_inactive(struct task_struct *p, unsigned int match_state
* is actually now running somewhere else!
*/
while (task_on_cpu(rq, p)) {
- if (match_state && unlikely(READ_ONCE(p->__state) != match_state))
+ if (match_state && !(READ_ONCE(p->__state) & match_state))
return 0;
cpu_relax();
}
@@ -3309,7 +3309,7 @@ unsigned long wait_task_inactive(struct task_struct *p, unsigned int match_state
running = task_on_cpu(rq, p);
queued = task_on_rq_queued(p);
ncsw = 0;
- if (!match_state || READ_ONCE(p->__state) == match_state)
+ if (!match_state || (READ_ONCE(p->__state) & match_state))
ncsw = p->nvcsw | LONG_MIN; /* sets MSB */
task_rq_unlock(rq, p, &rf);