[PATCH] sched: Fix rq nr_uninterruptible count

Posted by zhenggy 2 years, 6 months ago
When an uninterruptible task is queued on a different cpu than the one
it was dequeued from, the rq nr_uninterruptible count becomes incorrect,
so fix it.
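
For illustration, a minimal standalone sketch of the per-rq drift described
above (plain userspace C, not kernel code; the fake_rq type and the two-CPU
setup are made up for this example):

#include <stdio.h>

/* Toy stand-in for the per-CPU runqueue counter. */
struct fake_rq { long nr_uninterruptible; };

int main(void)
{
	struct fake_rq rq[2] = { { 0 }, { 0 } };

	rq[0].nr_uninterruptible++;	/* task blocks uninterruptibly while on CPU 0 */
	rq[1].nr_uninterruptible--;	/* the wakeup is handled on CPU 1 */

	/* Each per-CPU counter is now nonzero (+1 and -1); only their sum is 0. */
	printf("cpu0=%ld cpu1=%ld sum=%ld\n",
	       rq[0].nr_uninterruptible, rq[1].nr_uninterruptible,
	       rq[0].nr_uninterruptible + rq[1].nr_uninterruptible);
	return 0;
}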

Signed-off-by: GuoYong Zheng <zhenggy@chinatelecom.cn>
---
 kernel/sched/core.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 25b582b..cd5ef6e 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4068,6 +4068,7 @@ bool ttwu_state_match(struct task_struct *p, unsigned int state, int *success)
 {
 	unsigned long flags;
 	int cpu, success = 0;
+	struct rq *src_rq, *dst_rq;

 	preempt_disable();
 	if (p == current) {
@@ -4205,6 +4206,16 @@ bool ttwu_state_match(struct task_struct *p, unsigned int state, int *success)
 			atomic_dec(&task_rq(p)->nr_iowait);
 		}

+		if (p->sched_contributes_to_load) {
+			src_rq = cpu_rq(task_cpu(p));
+			dst_rq = cpu_rq(cpu);
+
+			double_rq_lock(src_rq, dst_rq);
+			src_rq->nr_uninterruptible--;
+			dst_rq->nr_uninterruptible++;
+			double_rq_unlock(src_rq, dst_rq);
+		}
+
 		wake_flags |= WF_MIGRATED;
 		psi_ttwu_dequeue(p);
 		set_task_cpu(p, cpu);
-- 
1.8.3.1
Re: [PATCH] sched: Fix rq nr_uninterruptible count
Posted by Nikolay Borisov 2 years, 6 months ago

On 28.02.23 10:46, zhenggy wrote:
> When an uninterruptible task is queued on a different cpu than the one
> it was dequeued from, the rq nr_uninterruptible count becomes incorrect,
> so fix it.
> 
> Signed-off-by: GuoYong Zheng <zhenggy@chinatelecom.cn>


 *  - cpu_rq()->nr_uninterruptible isn't accurately tracked per-CPU because
 *    this would add another cross-CPU cacheline miss and atomic operation
 *    to the wakeup path. Instead we increment on whatever CPU the task ran
 *    when it went into uninterruptible state and decrement on whatever CPU
 *    did the wakeup. This means that only the sum of nr_uninterruptible over
 *    all CPUs yields the correct result.
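
The quoted comment is from the load-average header comment in
kernel/sched/loadavg.c. A simplified standalone sketch of the scheme it
describes (plain userspace C; fold_active() and fake_rq are made-up names
for illustration, not the kernel's calc_load_fold_active()):

#include <stdio.h>

struct fake_rq {
	long nr_running;
	long nr_uninterruptible;	/* may legitimately be negative on any one CPU */
};

/*
 * Only the sum over all CPUs is meaningful, mirroring how the load
 * average folds the per-rq counters into a single global figure.
 */
static long fold_active(const struct fake_rq *rqs, int nr_cpus)
{
	long nr_active = 0;

	for (int cpu = 0; cpu < nr_cpus; cpu++)
		nr_active += rqs[cpu].nr_running + rqs[cpu].nr_uninterruptible;

	return nr_active;
}

int main(void)
{
	/*
	 * CPU 0 put a task to sleep uninterruptibly (+1), CPU 1 later woke
	 * it (-1) and is now running it: the per-CPU counters read +1 and -1.
	 */
	struct fake_rq rqs[2] = {
		{ .nr_running = 0, .nr_uninterruptible = 1 },
		{ .nr_running = 1, .nr_uninterruptible = -1 },
	};

	printf("global active = %ld\n", fold_active(rqs, 2));	/* prints 1 */
	return 0;
}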

> ---
>   kernel/sched/core.c | 11 +++++++++++
>   1 file changed, 11 insertions(+)
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 25b582b..cd5ef6e 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -4068,6 +4068,7 @@ bool ttwu_state_match(struct task_struct *p, unsigned int state, int *success)
>   {
>   	unsigned long flags;
>   	int cpu, success = 0;
> +	struct rq *src_rq, *dst_rq;
> 
>   	preempt_disable();
>   	if (p == current) {
> @@ -4205,6 +4206,16 @@ bool ttwu_state_match(struct task_struct *p, unsigned int state, int *success)
>   			atomic_dec(&task_rq(p)->nr_iowait);
>   		}
> 
> +		if (p->sched_contributes_to_load) {
> +			src_rq = cpu_rq(task_cpu(p));
> +			dst_rq = cpu_rq(cpu);
> +
> +			double_rq_lock(src_rq, dst_rq);
> +			src_rq->nr_uninterruptible--;
> +			dst_rq->nr_uninterruptible++;
> +			double_rq_unlock(src_rq, dst_rq);
> +		}
> +
>   		wake_flags |= WF_MIGRATED;
>   		psi_ttwu_dequeue(p);
>   		set_task_cpu(p, cpu);