From nobody Sat Feb 7 08:02:48 2026
From: Gabriele Monaco
To: Mathieu Desnoyers, Ingo Molnar, Peter Zijlstra, Juri Lelli,
 Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall,
 Mel Gorman, Valentin Schneider, linux-kernel@vger.kernel.org
Cc: Gabriele Monaco
Subject: [PATCH 1/2] sched: Optimise task_mm_cid_work duration
Date: Mon, 2 Dec 2024 15:07:34 +0100
Message-ID: <20241202140735.56368-2-gmonaco@redhat.com>
In-Reply-To: <20241202140735.56368-1-gmonaco@redhat.com>
References: <20241202140735.56368-1-gmonaco@redhat.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

The current behaviour of task_mm_cid_work is to loop through all
possible CPUs twice to clean up old mm_cid remotely. This can be a
waste of resources, especially for tasks with a CPU affinity.

This patch reduces the CPUs involved in the remote CID cleanup carried
out by task_mm_cid_work.

Using the mm_cidmask for the remote cleanup can considerably reduce the
function runtime in highly isolated environments, where each process
has affinity to a single core. In the worst case, the mask is
equivalent to all possible CPUs and we don't see any difference with
the current behaviour.

Signed-off-by: Gabriele Monaco
---
 kernel/sched/core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 95e40895a519..57b50b5952fa 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -10553,14 +10553,14 @@ static void task_mm_cid_work(struct callback_head *work)
 		return;
 	cidmask = mm_cidmask(mm);
 	/* Clear cids that were not recently used. */
-	for_each_possible_cpu(cpu)
+	for_each_cpu_from(cpu, cidmask)
 		sched_mm_cid_remote_clear_old(mm, cpu);
 	weight = cpumask_weight(cidmask);
 	/*
 	 * Clear cids that are greater or equal to the cidmask weight to
 	 * recompact it.
 	 */
-	for_each_possible_cpu(cpu)
+	for_each_cpu_from(cpu, cidmask)
 		sched_mm_cid_remote_clear_weight(mm, cpu, weight);
 }
 
-- 
2.47.0

From nobody Sat Feb 7 08:02:48 2026
From: Gabriele Monaco
To: Mathieu Desnoyers, Ingo Molnar, Peter Zijlstra, Juri Lelli,
 Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall,
 Mel Gorman, Valentin Schneider, linux-kernel@vger.kernel.org
Cc: Gabriele Monaco
Subject: [PATCH 2/2] sched: Move task_mm_cid_work to RCU callback
Date: Mon, 2 Dec 2024 15:07:35 +0100
Message-ID: <20241202140735.56368-3-gmonaco@redhat.com>
In-Reply-To: <20241202140735.56368-1-gmonaco@redhat.com>
References: <20241202140735.56368-1-gmonaco@redhat.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Currently, the task_mm_cid_work function is called in a task work
triggered by a scheduler tick. This can delay the execution of the task
for the entire duration of the function.

This patch runs task_mm_cid_work in the RCU callback thread rather than
in the task context before returning to userspace.

The main advantage of this change is that the function can be offloaded
to a different CPU and even preempted by RT tasks.

On a busy system, this may mean the function gets called less often,
but the current behaviour already provides no guarantees.

Signed-off-by: Gabriele Monaco
---
 include/linux/sched.h |  1 -
 kernel/sched/core.c   | 17 ++++++-----------
 2 files changed, 6 insertions(+), 12 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index d380bffee2ef..5d141c310917 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1374,7 +1374,6 @@ struct task_struct {
 	int				last_mm_cid;	/* Most recent cid in mm */
 	int				migrate_from_cpu;
 	int				mm_cid_active;	/* Whether cid bitmap is active */
-	struct callback_head		cid_work;
 #endif
 
 	struct tlbflush_unmap_batch	tlb_ubc;
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 57b50b5952fa..0fc1a972fd4f 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -10520,17 +10520,15 @@ static void sched_mm_cid_remote_clear_weight(struct mm_struct *mm, int cpu,
 	sched_mm_cid_remote_clear(mm, pcpu_cid, cpu);
 }
 
-static void task_mm_cid_work(struct callback_head *work)
+static void task_mm_cid_work(struct rcu_head *rhp)
 {
 	unsigned long now = jiffies, old_scan, next_scan;
-	struct task_struct *t = current;
+	struct task_struct *t = container_of(rhp, struct task_struct, rcu);
 	struct cpumask *cidmask;
 	struct mm_struct *mm;
 	int weight, cpu;
 
-	SCHED_WARN_ON(t != container_of(work, struct task_struct, cid_work));
-
-	work->next = work;	/* Prevent double-add */
+	rhp->next = rhp;	/* Prevent double-add */
 	if (t->flags & PF_EXITING)
 		return;
 	mm = t->mm;
@@ -10574,23 +10572,20 @@ void init_sched_mm_cid(struct task_struct *t)
 		if (mm_users == 1)
 			mm->mm_cid_next_scan = jiffies + msecs_to_jiffies(MM_CID_SCAN_DELAY);
 	}
-	t->cid_work.next = &t->cid_work;	/* Protect against double add */
-	init_task_work(&t->cid_work, task_mm_cid_work);
 }
 
 void task_tick_mm_cid(struct rq *rq, struct task_struct *curr)
 {
-	struct callback_head *work = &curr->cid_work;
+	struct rcu_head *rhp = &curr->rcu;
 	unsigned long now = jiffies;
 
 	if (!curr->mm || (curr->flags & (PF_EXITING | PF_KTHREAD)) ||
-	    work->next != work)
+	    rhp->next != rhp)
 		return;
 	if (time_before(now, READ_ONCE(curr->mm->mm_cid_next_scan)))
 		return;
 
-	/* No page allocation under rq lock */
-	task_work_add(curr, work, TWA_RESUME | TWAF_NO_ALLOC);
+	call_rcu(rhp, task_mm_cid_work);
 }
 
 void sched_mm_cid_exit_signals(struct task_struct *t)
-- 
2.47.0
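
As an illustration of the reasoning in patch 1/2, the cost difference between walking every possible CPU and walking only the CPUs set in a mask can be sketched in user-space C. This is not kernel code and does not use the kernel cpumask API: scan_all() and scan_masked() are hypothetical names, and a single 64-bit word is assumed to stand in for a cpumask.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a loop over every possible CPU:
 * the work is proportional to nr_cpus regardless of affinity.
 */
static unsigned int scan_all(unsigned int nr_cpus)
{
	unsigned int cpu, visits = 0;

	for (cpu = 0; cpu < nr_cpus; cpu++)
		visits++;	/* one remote cleanup attempt per CPU */
	return visits;
}

/*
 * Hypothetical stand-in for a loop restricted to a mask:
 * the work is proportional to the number of set bits only.
 */
static unsigned int scan_masked(uint64_t mask, unsigned int nr_cpus)
{
	unsigned int visits = 0;

	assert(nr_cpus >= 1 && nr_cpus <= 64);
	/* Ignore bits that do not correspond to a possible CPU. */
	if (nr_cpus < 64)
		mask &= (UINT64_C(1) << nr_cpus) - 1;

	while (mask) {
		mask &= mask - 1;	/* clear the lowest set bit */
		visits++;		/* one remote cleanup attempt per set bit */
	}
	return visits;
}
```

Under these assumptions, a mask with a single bit set on a 64-CPU machine costs one visit in scan_masked() against 64 in scan_all(), while a full mask costs 64 in both, which corresponds to the worst case described in the commit message.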