From: Konstantin Khorenko
To: Vincent Guittot
Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Dietmar Eggemann,
 Steven Rostedt, Ben Segall, Mel Gorman, Daniel Bristot de Oliveira,
 Valentin Schneider, Alexander Atanasov, linux-kernel@vger.kernel.org
Subject: [PATCH v2] sched/fair: Do not scan non-movable tasks several times
Date: Mon, 25 Dec 2023 18:29:43 +0300
Message-Id: <20231225152943.2657849-1-khorenko@virtuozzo.com>
In-Reply-To: <2cf94373-4f25-4a33-a0b4-cab04031bae7@virtuozzo.com>
References: <2cf94373-4f25-4a33-a0b4-cab04031bae7@virtuozzo.com>

If the busiest rq is small (nr_running < SCHED_NR_MIGRATE_BREAK) and none
of its tasks are movable, detach_tasks() should not iterate over more
tasks than are available in the busiest rq.

Before commit b0defa7ae03e ("sched/fair: Make sure to try to detach at
least one movable task"), the (env->loop > env->loop_max) condition
prevented us from scanning non-movable tasks more than rq size times.
Since that commit the LBF_ALL_PINNED flag is checked as well, so the
"all tasks are not movable" case is no longer bounded by that condition.

Note: if no task in the rq could be moved in detach_tasks(), we always
increase loop_break by SCHED_NR_MIGRATE_BREAK, so we can step over
loop_max. I think this is a rare case and it is not worth adding an
extra check here for rq->nr_running overlimit.

Fixes: b0defa7ae03e ("sched/fair: Make sure to try to detach at least one movable task")
Signed-off-by: Konstantin Khorenko
---

Changes:
v1->v2:
 * added the exact commit id that caused the inefficiency + Fixes: tag
 * dropped a couple of redundant env.loop_break assignments in
   load_balance()

 kernel/sched/fair.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index d7a3c63a2171..bd69c33fe9b4 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -11219,7 +11219,6 @@ static int load_balance(int this_cpu, struct rq *this_rq,
 		.dst_rq		= this_rq,
 		.dst_grpmask    = group_balance_mask(sd->groups),
 		.idle		= idle,
-		.loop_break	= SCHED_NR_MIGRATE_BREAK,
 		.cpus		= cpus,
 		.fbq_type	= all,
 		.tasks		= LIST_HEAD_INIT(env.tasks),
@@ -11266,6 +11265,14 @@ static int load_balance(int this_cpu, struct rq *this_rq,
 		 */
 		env.loop_max  = min(sysctl_sched_nr_migrate, busiest->nr_running);
 
+more_balance_reset_break:
+		/*
+		 * If busiest rq is small, nr_running < SCHED_NR_MIGRATE_BREAK
+		 * and all tasks are not movable, detach_tasks() should not
+		 * iterate more than tasks available in rq.
+		 */
+		env.loop_break = min(SCHED_NR_MIGRATE_BREAK, busiest->nr_running);
+
 more_balance:
 		rq_lock_irqsave(busiest, &rf);
 		update_rq_clock(busiest);
@@ -11328,13 +11335,12 @@ static int load_balance(int this_cpu, struct rq *this_rq,
 			env.dst_cpu	 = env.new_dst_cpu;
 			env.flags	&= ~LBF_DST_PINNED;
 			env.loop	 = 0;
-			env.loop_break	 = SCHED_NR_MIGRATE_BREAK;
 
 			/*
 			 * Go back to "more_balance" rather than "redo" since we
 			 * need to continue with same src_cpu.
 			 */
-			goto more_balance;
+			goto more_balance_reset_break;
 		}
 
 		/*
@@ -11360,7 +11366,6 @@ static int load_balance(int this_cpu, struct rq *this_rq,
 			 */
 			if (!cpumask_subset(cpus, env.dst_grpmask)) {
 				env.loop = 0;
-				env.loop_break = SCHED_NR_MIGRATE_BREAK;
 				goto redo;
 			}
 			goto out_all_pinned;
-- 
2.39.3
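
Illustration (not part of the patch): a minimal userspace sketch of the
scan bound described in the changelog, under simplifying assumptions. It
models a single detach_tasks() pass over a runqueue on which every task
is pinned, so the (env->loop > env->loop_max) exit never fires and only
the loop_break check ends the pass. The function names below are
hypothetical, and the value 32 for SCHED_NR_MIGRATE_BREAK is assumed here
(it depends on the kernel configuration); none of this is kernel code.

```c
#include <assert.h>

/* Assumed value for this sketch; the real one depends on the kernel config. */
#define SCHED_NR_MIGRATE_BREAK 32u

unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* loop_break as initialised before the patch: a fixed constant. */
unsigned int loop_break_before(unsigned int nr_running)
{
	(void)nr_running;	/* rq size was not taken into account */
	return SCHED_NR_MIGRATE_BREAK;
}

/* loop_break as initialised by the patch: clamped to the rq size. */
unsigned int loop_break_after(unsigned int nr_running)
{
	return min_uint(SCHED_NR_MIGRATE_BREAK, nr_running);
}

/*
 * Hypothetical model of one detach_tasks() pass when all tasks are
 * pinned. Returns the number of task checks performed before the
 * "take a breather" exit is taken.
 */
unsigned int scans_when_all_pinned(unsigned int nr_running,
				   unsigned int loop_break)
{
	unsigned int loop = 0;
	unsigned int scanned = 0;

	assert(nr_running > 1);	/* a single-task rq is not scanned at all */

	for (;;) {
		loop++;
		/*
		 * The loop_max exit is left out on purpose: in this model
		 * LBF_ALL_PINNED stays set, so that exit is never taken.
		 */
		if (loop > loop_break)
			break;
		/* Task checked, found pinned, rotated back into the list. */
		scanned++;
	}
	return scanned;
}
```

In this model, a runqueue with 4 pinned tasks is checked 32 times per
pass with the old initialisation (each task visited 8 times) and 4 times
with the clamped one. For a runqueue of 100 tasks both variants stop
after 32 checks, so larger runqueues are unaffected.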