From nobody Fri Jun 12 18:35:46 2026
From: Juri Lelli
Date: Wed, 13 May 2026 11:13:03 +0200
Subject: [PATCH v2] sched/deadline: Make dl-server nohz full aware
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Message-Id: <20260513-upstream-fix-dlserver-nohzfull-b4-v2-1-d3e9cbe5c845@redhat.com>
X-Change-ID: 20260513-upstream-fix-dlserver-nohzfull-b4-fa741a2b6189
To: Ingo Molnar, Peter Zijlstra, Vincent Guittot, Dietmar Eggemann,
 Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider,
 K Prateek Nayak, Andrea Righi, Frederic Weisbecker
Cc: linux-kernel@vger.kernel.org, David Haufe, Cao Ruichuang,
 Furkan Çalışkan, Juri Lelli

The dl_server_timer() originally caused spurious IPIs on nohz_full cores,
breaking isolation guarantees. While such IPIs cannot be observed on recent
kernels, dl-server timers for tick-stopped isolated CPUs still fire
unnecessarily on housekeeping cores.

The problem is that dl-servers are not coordinated with nohz_full tick
state. Even when the tick stops on an isolated CPU, its dl-server timer
continues to fire on housekeeping, wasting cycles and potentially
affecting housekeeping CPU performance.

Fix by managing servers in sched_can_stop_tick():
- When RT tasks run with CFS/SCX tasks, start the appropriate server(s)
  and keep the tick running
- When only RT tasks remain, stop all servers and allow tick to stop
  (except for >1 RR tasks which need the tick for round-robin)
- When only CFS/SCX tasks remain, stop all servers before stopping tick

Introduce dl_servers_stop_all() to reduce duplication and abstract server
management from core.c. Unify RT handling into one block that handles both
RR and FIFO cases.

Note on SCX: While SCX is incompatible with isolcpus=domain, it does
support nohz_full.
The ext_server handling in this patch targets nohz_full configurations
without domain isolation.

Fixes: 557a6bfc662c ("sched/fair: Add trivial fair server")
Reported-by: David Haufe
Closes: https://lore.kernel.org/lkml/CAKJHwtOw_G67edzuHVtL1xC5Vyt6StcZzihtDd0yaKudW=rwVw@mail.gmail.com
Signed-off-by: Juri Lelli
Reviewed-by: Andrea Righi
Reviewed-by: Valentin Schneider
---
Changes from v1 [1]
- Fix CFS/SCX server start logic to handle both simultaneously in partial
  switch mode (Furkan)
- Clarify in commit message that SCX supports nohz_full despite
  isolcpus=domain incompatibility (Andrea)

1 - https://lore.kernel.org/lkml/20260512-upstream-fix-dlserver-nohzfull-b4-v1-1-a94844387ae7@redhat.com/
---
 kernel/sched/core.c     | 46 +++++++++++++++++++++++++++-------------------
 kernel/sched/deadline.c | 14 ++++++++++++++
 kernel/sched/sched.h    |  1 +
 3 files changed, 42 insertions(+), 19 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index b905805bbcbe4..6d05ce9b1dfe6 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1414,30 +1414,40 @@ static inline bool __need_bw_check(struct rq *rq, struct task_struct *p)
 
 bool sched_can_stop_tick(struct rq *rq)
 {
-	int fifo_nr_running;
-
 	/* Deadline tasks, even if single, need the tick */
 	if (rq->dl.dl_nr_running)
 		return false;
 
 	/*
-	 * If there are more than one RR tasks, we need the tick to affect the
-	 * actual RR behaviour.
+	 * If there are RT tasks, we may need the tick (for >1 RR tasks),
+	 * but we must also service lower-priority CFS/SCX tasks via dl-servers.
 	 */
-	if (rq->rt.rr_nr_running) {
-		if (rq->rt.rr_nr_running == 1)
-			return true;
-		else
+	if (rq->rt.rt_nr_running) {
+		bool cfs_or_scx_queued = false;
+
+		if (rq->cfs.h_nr_queued) {
+			dl_server_start(&rq->fair_server);
+			cfs_or_scx_queued = true;
+		}
+#ifdef CONFIG_SCHED_CLASS_EXT
+		if (rq->scx.nr_running) {
+			dl_server_start(&rq->ext_server);
+			cfs_or_scx_queued = true;
+		}
+#endif
+		if (cfs_or_scx_queued)
 			return false;
-	}
 
-	/*
-	 * If there's no RR tasks, but FIFO tasks, we can skip the tick, no
-	 * forced preemption between FIFO tasks.
-	 */
-	fifo_nr_running = rq->rt.rt_nr_running - rq->rt.rr_nr_running;
-	if (fifo_nr_running)
+		/*
+		 * Only RT tasks, no CFS/SCX. Stop servers to prevent spurious
+		 * wakeups. Tick can stop for single RR or any FIFO, but must
+		 * run for multiple RR (round-robin behavior).
+		 */
+		dl_servers_stop_all(rq);
+		if (rq->rt.rr_nr_running > 1)
+			return false;
 		return true;
+	}
 
 	/*
 	 * If there are no DL,RR/FIFO tasks, there must only be CFS or SCX tasks
@@ -1462,6 +1472,7 @@ bool sched_can_stop_tick(struct rq *rq)
 			return false;
 	}
 
+	dl_servers_stop_all(rq);
 	return true;
 }
 #endif /* CONFIG_NO_HZ_FULL */
@@ -8810,10 +8821,7 @@ int sched_cpu_dying(unsigned int cpu)
 		WARN(true, "Dying CPU not properly vacated!");
 		dump_rq_tasks(rq, KERN_WARNING);
 	}
-	dl_server_stop(&rq->fair_server);
-#ifdef CONFIG_SCHED_CLASS_EXT
-	dl_server_stop(&rq->ext_server);
-#endif
+	dl_servers_stop_all(rq);
 	rq_unlock_irqrestore(rq, &rf);
 
 	calc_load_migrate(rq);
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index edca7849b165d..c2b3d6bbe4828 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1826,6 +1826,20 @@ void dl_server_stop(struct sched_dl_entity *dl_se)
 	dl_se->dl_server_active = 0;
 }
 
+/*
+ * Stop all dl-servers on this runqueue. Called when transitioning to a state
+ * where the tick can be stopped (e.g., single RR/FIFO task, or no RT tasks).
+ * This ensures server timers are disarmed and won't cause spurious wakeups on
+ * nohz_full isolated cores.
+ */
+void dl_servers_stop_all(struct rq *rq)
+{
+	dl_server_stop(&rq->fair_server);
+#ifdef CONFIG_SCHED_CLASS_EXT
+	dl_server_stop(&rq->ext_server);
+#endif
+}
+
 void dl_server_init(struct sched_dl_entity *dl_se, struct rq *rq,
 		    dl_server_pick_f pick_task)
 {
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 9f63b15d309d1..26cf1d14efde5 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -412,6 +412,7 @@ extern void dl_server_update_idle(struct sched_dl_entity *dl_se, s64 delta_exec)
 extern void dl_server_update(struct sched_dl_entity *dl_se, s64 delta_exec);
 extern void dl_server_start(struct sched_dl_entity *dl_se);
 extern void dl_server_stop(struct sched_dl_entity *dl_se);
+extern void dl_servers_stop_all(struct rq *rq);
 extern void dl_server_init(struct sched_dl_entity *dl_se, struct rq *rq,
 		    dl_server_pick_f pick_task);
 extern void sched_init_dl_servers(void);
---
base-commit: 4ac4d6549a6563878d7c19c154e017f6cb7114d3
change-id: 20260513-upstream-fix-dlserver-nohzfull-b4-fa741a2b6189

Best regards,
-- 
Juri Lelli
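
For readers who want to reason about the three cases listed in the commit
message without building a kernel, the decision logic can be approximated in
user space. The following is only a sketch under stated assumptions, not
kernel code: struct rq_model, model_servers_stop_all() and
model_can_stop_tick() are hypothetical names, the two booleans stand in for
the armed state of rq->fair_server and rq->ext_server, and the additional
CFS/SCX checks that sched_can_stop_tick() performs before its final return
are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the struct rq fields the patch reads. */
struct rq_model {
	unsigned int dl_nr_running;	/* SCHED_DEADLINE tasks */
	unsigned int rt_nr_running;	/* all RT tasks (RR + FIFO) */
	unsigned int rr_nr_running;	/* SCHED_RR tasks only */
	unsigned int cfs_h_nr_queued;	/* queued fair tasks */
	unsigned int scx_nr_running;	/* queued sched_ext tasks */
	bool fair_server_active;	/* models rq->fair_server being armed */
	bool ext_server_active;		/* models rq->ext_server being armed */
};

/* Models dl_servers_stop_all(): disarm both servers. */
static void model_servers_stop_all(struct rq_model *rq)
{
	rq->fair_server_active = false;
	rq->ext_server_active = false;
}

/* Models the tick decision; returns true when the tick may stop. */
static bool model_can_stop_tick(struct rq_model *rq)
{
	/* RR tasks are a subset of RT tasks. */
	assert(rq->rr_nr_running <= rq->rt_nr_running);

	/* Deadline tasks always need the tick. */
	if (rq->dl_nr_running)
		return false;

	if (rq->rt_nr_running) {
		bool cfs_or_scx_queued = false;

		/* RT mixed with CFS/SCX: arm the matching server(s), keep the tick. */
		if (rq->cfs_h_nr_queued) {
			rq->fair_server_active = true;
			cfs_or_scx_queued = true;
		}
		if (rq->scx_nr_running) {
			rq->ext_server_active = true;
			cfs_or_scx_queued = true;
		}
		if (cfs_or_scx_queued)
			return false;

		/* Only RT tasks: no server needed; >1 RR still needs the tick. */
		model_servers_stop_all(rq);
		return rq->rr_nr_running <= 1;
	}

	/* Only CFS/SCX tasks; the kernel's extra checks here are omitted. */
	model_servers_stop_all(rq);
	return true;
}
```

Calling model_can_stop_tick() on a model with one FIFO task and two queued
fair tasks keeps the tick and arms only the fair server, while a model with a
single RR task and nothing else lets the tick stop with both servers disarmed.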