From: Yuri Andriaccio
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider
Cc: linux-kernel@vger.kernel.org, Luca Abeni, Yuri Andriaccio
Subject: [RFC PATCH v3 19/24] sched/rt: Add rt-cgroup migration
Date: Mon, 29 Sep 2025 11:22:16 +0200
Message-ID: <20250929092221.10947-20-yurand2000@gmail.com>
In-Reply-To: <20250929092221.10947-1-yurand2000@gmail.com>
References: <20250929092221.10947-1-yurand2000@gmail.com>

From: luca abeni

When the runtime is exhausted in an RT cgroup, the scheduler checks for
another non-throttled runqueue and, if one is available, migrates the
tasks. The bandwidth (runtime/period) chosen for a given cgroup is
replicated on every core of the system; therefore, on an SMP system with
M cores, the total available bandwidth is the given runtime/period
multiplied by M.
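Purely as an illustration of the replication rule above (this snippet is
not part of the patch, and the helper below is invented for the example),
the total bandwidth granted to a group scales linearly with the CPU count:

#include <stdio.h>

/*
 * Hypothetical helper, not kernel code: total bandwidth (in CPUs) granted
 * to a group whose runtime/period reservation is replicated on each of
 * nr_cpus CPUs.
 */
static double group_total_bandwidth(double runtime_us, double period_us,
				    int nr_cpus)
{
	return (runtime_us / period_us) * nr_cpus;
}

int main(void)
{
	/* 10 ms every 100 ms, replicated on 8 CPUs -> 0.8 CPUs in total. */
	printf("%.2f\n", group_total_bandwidth(10000.0, 100000.0, 8));
	return 0;
}
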
Co-developed-by: Alessio Balsini
Signed-off-by: Alessio Balsini
Co-developed-by: Andrea Parri
Signed-off-by: Andrea Parri
Co-developed-by: Yuri Andriaccio
Signed-off-by: Yuri Andriaccio
Signed-off-by: luca abeni
---
 kernel/sched/rt.c    | 495 +++++++++++++++++++++++++++++++++++++++++++
 kernel/sched/sched.h |  10 +
 2 files changed, 505 insertions(+)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 17ad91261cb..1fbcac14fc9 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -1,3 +1,4 @@
+#pragma GCC diagnostic ignored "-Wunused-function"
 // SPDX-License-Identifier: GPL-2.0
 /*
  * Real-Time Scheduling Class (mapped to the SCHED_FIFO and SCHED_RR
@@ -84,6 +85,8 @@ void init_rt_rq(struct rt_rq *rt_rq)
 	plist_head_init(&rt_rq->pushable_tasks);
 }
 
+static void group_pull_rt_task(struct rt_rq *this_rt_rq);
+
 #ifdef CONFIG_RT_GROUP_SCHED
 
 void unregister_rt_sched_group(struct task_group *tg)
@@ -136,6 +139,7 @@ void init_tg_rt_entry(struct task_group *tg, struct rq *served_rq,
 		      struct sched_dl_entity *dl_se, int cpu,
 		      struct sched_dl_entity *parent)
 {
+	served_rq->cpu = cpu;
 	served_rq->rt.highest_prio.curr = MAX_RT_PRIO-1;
 	served_rq->rt.rq = cpu_rq(cpu);
 	served_rq->rt.tg = tg;
@@ -299,6 +303,53 @@ static inline void rt_queue_pull_task(struct rt_rq *rt_rq)
 	queue_balance_callback(rq, &per_cpu(rt_pull_head, rq->cpu), pull_rt_task);
 }
 
+#ifdef CONFIG_RT_GROUP_SCHED
+static DEFINE_PER_CPU(struct balance_callback, rt_group_push_head);
+static DEFINE_PER_CPU(struct balance_callback, rt_group_pull_head);
+static void group_push_rt_tasks_callback(struct rq *);
+static void group_pull_rt_task_callback(struct rq *);
+
+static void rt_queue_push_from_group(struct rt_rq *rt_rq)
+{
+	struct rq *rq = container_of(rt_rq, struct rq, rt);
+	struct rq *global_rq = cpu_rq(rq->cpu);
+
+	BUG_ON(rt_rq == NULL);
+	BUG_ON(rt_rq->rq != global_rq);
+
+	if (global_rq->rq_to_push_from)
+		return;
+
+	if (!has_pushable_tasks(rt_rq))
+		return;
+
+	global_rq->rq_to_push_from = rq;
+	queue_balance_callback(global_rq, &per_cpu(rt_group_push_head, global_rq->cpu),
+			       group_push_rt_tasks_callback);
+}
+
+static void rt_queue_pull_to_group(struct rt_rq *rt_rq)
+{
+	struct rq *rq = container_of(rt_rq, struct rq, rt);
+	struct rq *global_rq = cpu_rq(rq->cpu);
+	struct sched_dl_entity *dl_se = dl_group_of(rt_rq);
+
+	BUG_ON(rt_rq == NULL);
+	BUG_ON(!is_dl_group(rt_rq));
+	BUG_ON(rt_rq->rq != global_rq);
+
+	if (dl_se->dl_throttled || global_rq->rq_to_pull_to)
+		return;
+
+	global_rq->rq_to_pull_to = rq;
+	queue_balance_callback(global_rq, &per_cpu(rt_group_pull_head, global_rq->cpu),
+			       group_pull_rt_task_callback);
+}
+#else
+static inline void rt_queue_push_from_group(struct rt_rq *rt_rq) {};
+static inline void rt_queue_pull_to_group(struct rt_rq *rt_rq) {};
+#endif
+
 static void enqueue_pushable_task(struct rt_rq *rt_rq, struct task_struct *p)
 {
 	plist_del(&p->pushable_tasks, &rt_rq->pushable_tasks);
@@ -1303,6 +1354,8 @@ static struct rq *find_lock_lowest_rq(struct task_struct *task, struct rq *rq)
  */
 static int push_rt_task(struct rq *rq, bool pull)
 {
+	BUG_ON(is_dl_group(&rq->rt));
+
 	struct task_struct *next_task;
 	struct rq *lowest_rq;
 	int ret = 0;
@@ -1599,6 +1652,8 @@ void rto_push_irq_work_func(struct irq_work *work)
 
 static void pull_rt_task(struct rq *this_rq)
 {
+	BUG_ON(is_dl_group(&this_rq->rt));
+
 	int this_cpu = this_rq->cpu, cpu;
 	bool resched = false;
 	struct task_struct *p, *push_task;
@@ -1708,6 +1763,446 @@ static void pull_rt_task(struct rq *this_rq)
 		resched_curr(this_rq);
 }
 
+#ifdef CONFIG_RT_GROUP_SCHED
+/*
+ * Find the lowest priority runqueue among the runqueues of the same
+ * task group. Unlike find_lowest_rt(), this does not mean that the
+ * lowest priority cpu is running tasks from this runqueue.
+ */
+static int group_find_lowest_rt_rq(struct task_struct *task, struct rt_rq *task_rt_rq)
+{
+	struct sched_domain *sd;
+	struct cpumask mask, *lowest_mask = &mask;
+	struct sched_dl_entity *dl_se;
+	struct rt_rq *rt_rq;
+	int prio, lowest_prio;
+	int cpu, this_cpu = smp_processor_id();
+
+	BUG_ON(task->sched_task_group != task_rt_rq->tg);
+
+	if (task->nr_cpus_allowed == 1)
+		return -1; /* No other targets possible */
+
+	lowest_prio = task->prio - 1;
+	cpumask_clear(lowest_mask);
+	for_each_cpu_and(cpu, cpu_online_mask, task->cpus_ptr) {
+		dl_se = task_rt_rq->tg->dl_se[cpu];
+		rt_rq = &dl_se->my_q->rt;
+		prio = rt_rq->highest_prio.curr;
+
+		/*
+		 * If we're on asym system ensure we consider the different capacities
+		 * of the CPUs when searching for the lowest_mask.
+		 */
+		if (dl_se->dl_throttled || !rt_task_fits_capacity(task, cpu))
+			continue;
+
+		if (prio >= lowest_prio) {
+			if (prio > lowest_prio) {
+				cpumask_clear(lowest_mask);
+				lowest_prio = prio;
+			}
+
+			cpumask_set_cpu(cpu, lowest_mask);
+		}
+	}
+
+	if (cpumask_empty(lowest_mask))
+		return -1;
+
+	/*
+	 * At this point we have built a mask of CPUs representing the
+	 * lowest priority tasks in the system.  Now we want to elect
+	 * the best one based on our affinity and topology.
+	 *
+	 * We prioritize the last CPU that the task executed on since
+	 * it is most likely cache-hot in that location.
+	 */
+	cpu = task_cpu(task);
+	if (cpumask_test_cpu(cpu, lowest_mask))
+		return cpu;
+
+	/*
+	 * Otherwise, we consult the sched_domains span maps to figure
+	 * out which CPU is logically closest to our hot cache data.
+	 */
+	if (!cpumask_test_cpu(this_cpu, lowest_mask))
+		this_cpu = -1; /* Skip this_cpu opt if not among lowest */
+
+	rcu_read_lock();
+	for_each_domain(cpu, sd) {
+		if (sd->flags & SD_WAKE_AFFINE) {
+			int best_cpu;
+
+			/*
+			 * "this_cpu" is cheaper to preempt than a
+			 * remote processor.
+			 */
+			if (this_cpu != -1 &&
+			    cpumask_test_cpu(this_cpu, sched_domain_span(sd))) {
+				rcu_read_unlock();
+				return this_cpu;
+			}
+
+			best_cpu = cpumask_any_and_distribute(lowest_mask,
+							      sched_domain_span(sd));
+			if (best_cpu < nr_cpu_ids) {
+				rcu_read_unlock();
+				return best_cpu;
+			}
+		}
+	}
+	rcu_read_unlock();
+
+	/*
+	 * And finally, if there were no matches within the domains
+	 * just give the caller *something* to work with from the compatible
+	 * locations.
+	 */
+	if (this_cpu != -1)
+		return this_cpu;
+
+	cpu = cpumask_any_distribute(lowest_mask);
+	if (cpu < nr_cpu_ids)
+		return cpu;
+
+	return -1;
+}
+
+/*
+ * Find and lock the lowest priority runqueue among the runqueues
+ * of the same task group. Unlike find_lock_lowest_rt(), this does not
+ * mean that the lowest priority cpu is running tasks from this runqueue.
+ */
+static struct rt_rq *group_find_lock_lowest_rt_rq(struct task_struct *task, struct rt_rq *rt_rq)
+{
+	struct rq *rq = rq_of_rt_rq(rt_rq);
+	struct rq *lowest_rq;
+	struct rt_rq *lowest_rt_rq = NULL;
+	struct sched_dl_entity *lowest_dl_se;
+	int tries, cpu;
+
+	BUG_ON(task->sched_task_group != rt_rq->tg);
+
+	for (tries = 0; tries < RT_MAX_TRIES; tries++) {
+		cpu = group_find_lowest_rt_rq(task, rt_rq);
+
+		if ((cpu == -1) || (cpu == rq->cpu))
+			break;
+
+		lowest_dl_se = rt_rq->tg->dl_se[cpu];
+		lowest_rt_rq = &lowest_dl_se->my_q->rt;
+		lowest_rq = cpu_rq(cpu);
+
+		if (lowest_rt_rq->highest_prio.curr <= task->prio) {
+			/*
+			 * Target rq has tasks of equal or higher priority,
+			 * retrying does not release any lock and is unlikely
+			 * to yield a different result.
+			 */
+			lowest_rt_rq = NULL;
+			break;
+		}
+
+		/* if the prio of this runqueue changed, try again */
+		if (double_lock_balance(rq, lowest_rq)) {
+			/*
+			 * We had to unlock the run queue. In
+			 * the mean time, task could have
+			 * migrated already or had its affinity changed.
+			 * Also make sure that it wasn't scheduled on its rq.
+			 * It is possible the task was scheduled, set
+			 * "migrate_disabled" and then got preempted, so we must
+			 * check the task migration disable flag here too.
+			 */
+			if (unlikely(is_migration_disabled(task) ||
+				     lowest_dl_se->dl_throttled ||
+				     !cpumask_test_cpu(lowest_rq->cpu, &task->cpus_mask) ||
+				     task != pick_next_pushable_task(rt_rq))) {
+
+				double_unlock_balance(rq, lowest_rq);
+				lowest_rt_rq = NULL;
+				break;
+			}
+		}
+
+		/* If this rq is still suitable use it. */
+		if (lowest_rt_rq->highest_prio.curr > task->prio)
+			break;
+
+		/* try again */
+		double_unlock_balance(rq, lowest_rq);
+		lowest_rt_rq = NULL;
+	}
+
+	return lowest_rt_rq;
+}
+
+static int group_push_rt_task(struct rt_rq *rt_rq, bool pull)
+{
+	BUG_ON(!is_dl_group(rt_rq));
+
+	struct rq *rq = rq_of_rt_rq(rt_rq);
+	struct task_struct *next_task;
+	struct rq *lowest_rq;
+	struct rt_rq *lowest_rt_rq;
+	int ret = 0;
+
+	if (!rt_rq->overloaded)
+		return 0;
+
+	next_task = pick_next_pushable_task(rt_rq);
+	if (!next_task)
+		return 0;
+
+retry:
+	if (is_migration_disabled(next_task)) {
+		struct task_struct *push_task = NULL;
+		int cpu;
+
+		if (!pull || rq->push_busy)
+			return 0;
+
+		/*
+		 * If the current task does not belong to the same task group
+		 * we cannot push it away.
+		 */
+		if (rq->curr->sched_task_group != rt_rq->tg)
+			return 0;
+
+		/*
+		 * Invoking group_find_lowest_rt_rq() on anything but an RT task doesn't
+		 * make sense. Per the above priority check, curr has to
+		 * be of higher priority than next_task, so no need to
+		 * reschedule when bailing out.
+		 *
+		 * Note that the stoppers are masqueraded as SCHED_FIFO
+		 * (cf. sched_set_stop_task()), so we can't rely on rt_task().
+		 */
+		if (rq->curr->sched_class != &rt_sched_class)
+			return 0;
+
+		cpu = group_find_lowest_rt_rq(rq->curr, rt_rq);
+		if (cpu == -1 || cpu == rq->cpu)
+			return 0;
+
+		/*
+		 * Given we found a CPU with lower priority than @next_task,
+		 * therefore it should be running. However we cannot migrate it
+		 * to this other CPU, instead attempt to push the current
+		 * running task on this CPU away.
+		 */
+		push_task = get_push_task(rq);
+		if (push_task) {
+			preempt_disable();
+			raw_spin_rq_unlock(rq);
+			stop_one_cpu_nowait(rq->cpu, push_cpu_stop,
+					    push_task, &rq->push_work);
+			preempt_enable();
+			raw_spin_rq_lock(rq);
+		}
+
+		return 0;
+	}
+
+	if (WARN_ON(next_task == rq->curr))
+		return 0;
+
+	/* We might release rq lock */
+	get_task_struct(next_task);
+
+	/* group_find_lock_lowest_rq locks the rq if found */
+	lowest_rt_rq = group_find_lock_lowest_rt_rq(next_task, rt_rq);
+	if (!lowest_rt_rq) {
+		struct task_struct *task;
+		/*
+		 * group_find_lock_lowest_rt_rq releases rq->lock
+		 * so it is possible that next_task has migrated.
+		 *
+		 * We need to make sure that the task is still on the same
+		 * run-queue and is also still the next task eligible for
+		 * pushing.
+		 */
+		task = pick_next_pushable_task(rt_rq);
+		if (task == next_task) {
+			/*
+			 * The task hasn't migrated, and is still the next
+			 * eligible task, but we failed to find a run-queue
+			 * to push it to. Do not retry in this case, since
+			 * other CPUs will pull from us when ready.
+			 */
+			goto out;
+		}
+
+		if (!task)
+			/* No more tasks, just exit */
+			goto out;
+
+		/*
+		 * Something has shifted, try again.
+		 */
+		put_task_struct(next_task);
+		next_task = task;
+		goto retry;
+	}
+
+	lowest_rq = rq_of_rt_rq(lowest_rt_rq);
+
+	move_queued_task_locked(rq, lowest_rq, next_task);
+	resched_curr(lowest_rq);
+	ret = 1;
+
+	double_unlock_balance(rq, lowest_rq);
+out:
+	put_task_struct(next_task);
+
+	return ret;
+}
+
+static void group_pull_rt_task(struct rt_rq *this_rt_rq)
+{
+	BUG_ON(!is_dl_group(this_rt_rq));
+
+	struct rq *this_rq = rq_of_rt_rq(this_rt_rq);
+	int this_cpu = this_rq->cpu, cpu;
+	bool resched = false;
+	struct task_struct *p, *push_task = NULL;
+	struct rt_rq *src_rt_rq;
+	struct rq *src_rq;
+	struct sched_dl_entity *src_dl_se;
+
+	for_each_online_cpu(cpu) {
+		if (this_cpu == cpu)
+			continue;
+
+		src_dl_se = this_rt_rq->tg->dl_se[cpu];
+		src_rt_rq = &src_dl_se->my_q->rt;
+
+		if (src_rt_rq->rt_nr_running <= 1 && !src_dl_se->dl_throttled)
+			continue;
+
+		src_rq = rq_of_rt_rq(src_rt_rq);
+
+		/*
+		 * Don't bother taking the src_rq->lock if the next highest
+		 * task is known to be lower-priority than our current task.
+		 * This may look racy, but if this value is about to go
+		 * logically higher, the src_rq will push this task away.
+		 * And if its going logically lower, we do not care
+		 */
+		if (src_rt_rq->highest_prio.next >=
+		    this_rt_rq->highest_prio.curr)
+			continue;
+
+		/*
+		 * We can potentially drop this_rq's lock in
+		 * double_lock_balance, and another CPU could
+		 * alter this_rq
+		 */
+		push_task = NULL;
+		double_lock_balance(this_rq, src_rq);
+
+		/*
+		 * We can pull only a task, which is pushable
+		 * on its rq, and no others.
+		 */
+		p = pick_highest_pushable_task(src_rt_rq, this_cpu);
+
+		/*
+		 * Do we have an RT task that preempts
+		 * the to-be-scheduled task?
+		 */
+		if (p && (p->prio < this_rt_rq->highest_prio.curr)) {
+			WARN_ON(p == src_rq->curr);
+			WARN_ON(!task_on_rq_queued(p));
+
+			/*
+			 * There's a chance that p is higher in priority
+			 * than what's currently running on its CPU.
+			 * This is just that p is waking up and hasn't
+			 * had a chance to schedule. We only pull
+			 * p if it is lower in priority than the
+			 * current task on the run queue
+			 */
+			if (src_rq->curr->sched_task_group == this_rt_rq->tg &&
+			    p->prio < src_rq->curr->prio)
+				goto skip;
+
+			if (is_migration_disabled(p)) {
+				/*
+				 * If the current task does not belong to the same task group
+				 * we cannot push it away.
+				 */
+				if (src_rq->curr->sched_task_group != this_rt_rq->tg)
+					goto skip;
+
+				push_task = get_push_task(src_rq);
+			} else {
+				move_queued_task_locked(src_rq, this_rq, p);
+				resched = true;
+			}
+			/*
+			 * We continue with the search, just in
+			 * case there's an even higher prio task
+			 * in another runqueue. (low likelihood
+			 * but possible)
+			 */
+		}
+skip:
+		double_unlock_balance(this_rq, src_rq);
+
+		if (push_task) {
+			preempt_disable();
+			raw_spin_rq_unlock(this_rq);
+			stop_one_cpu_nowait(src_rq->cpu, push_cpu_stop,
+					    push_task, &src_rq->push_work);
+			preempt_enable();
+			raw_spin_rq_lock(this_rq);
+		}
+	}
+
+	if (resched)
+		resched_curr(this_rq);
+}
+
+static void group_push_rt_tasks(struct rt_rq *rt_rq)
+{
+	while (group_push_rt_task(rt_rq, false))
+		;
+}
+
+static void group_push_rt_tasks_callback(struct rq *global_rq)
+{
+	struct rt_rq *rt_rq = &global_rq->rq_to_push_from->rt;
+
+	BUG_ON(global_rq->rq_to_push_from == NULL);
+	BUG_ON(rt_rq->rq != global_rq);
+
+	if ((rt_rq->rt_nr_running > 1) ||
+	    (dl_group_of(rt_rq)->dl_throttled == 1)) {
+
+		while (group_push_rt_task(rt_rq, false))
+			;
+	}
+
+	global_rq->rq_to_push_from = NULL;
+}
+
+static void group_pull_rt_task_callback(struct rq *global_rq)
+{
+	struct rt_rq *rt_rq = &global_rq->rq_to_pull_to->rt;
+
+	BUG_ON(global_rq->rq_to_pull_to == NULL);
+	BUG_ON(rt_rq->rq != global_rq);
+
+	group_pull_rt_task(rt_rq);
+	global_rq->rq_to_pull_to = NULL;
+}
+#else /* CONFIG_RT_GROUP_SCHED */
+static void group_pull_rt_task(struct rt_rq *this_rt_rq) { }
+static void group_push_rt_tasks(struct rt_rq *rt_rq) { }
+#endif /* CONFIG_RT_GROUP_SCHED */
+
 /*
  * If we are not running and we are not going to reschedule soon, we should
  * try to push tasks away now
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 55631d93e02..4e287aaa72f 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1282,6 +1282,16 @@ struct rq {
 	call_single_data_t	cfsb_csd;
 	struct list_head	cfsb_csd_list;
 #endif
+
+#ifdef CONFIG_RT_GROUP_SCHED
+	/*
+	 * Balance callbacks operate only on global runqueues.
+	 * These pointers allow referencing cgroup specific runqueues
+	 * for balancing operations.
+	 */
+	struct rq		*rq_to_push_from;
+	struct rq		*rq_to_pull_to;
+#endif
 };
 
 #ifdef CONFIG_FAIR_GROUP_SCHED
-- 
2.51.0