From: Mathieu Desnoyers
To: Gabriele Monaco
Cc: linux-kernel@vger.kernel.org, Mathieu Desnoyers, Peter Zijlstra, Marco Elver, Ingo Molnar
Subject: [PATCH] sched: Compact RSEQ concurrency IDs with reduced threads and affinity
Date: Thu, 12 Dec 2024 10:49:43 -0500
Message-Id: <20241212154943.148632-1-mathieu.desnoyers@efficios.com>

When a process reduces its number of threads or clears bits in its CPU
affinity mask, the mm_cid allocation should eventually converge towards
smaller values. However, the change introduced by:

commit 7e019dcc470f ("sched: Improve cache locality of RSEQ concurrency
IDs for intermittent workloads")

adds a per-mm/CPU recent_cid which is never unset unless a thread
migrates. This is a tradeoff between:

A) Preserving cache locality after a transition from many threads to
   few threads, or after reducing the hamming weight of the allowed
   CPU mask.
B) Making the mm_cid upper bounds wrt nr threads and allowed CPU mask
   easy to document and understand.

C) Allowing applications to eventually react to mm_cid compaction after
   reduction of the nr threads or allowed CPU mask, making the tracking
   of mm_cid compaction easier by shrinking it back towards 0 or not.

D) Making sure applications that periodically reduce and then increase
   again the nr threads or allowed CPU mask still benefit from good
   cache locality with mm_cid.

Introduce the following changes:

* After shrinking the number of threads or reducing the number of
  allowed CPUs, reduce the value of max_nr_cid so expansion of CID
  allocation will preserve cache locality if the number of threads or
  allowed CPUs increase again.

* Only re-use a recent_cid if it is within the max_nr_cid upper bound,
  else find the first available CID.

Fixes: 7e019dcc470f ("sched: Improve cache locality of RSEQ concurrency IDs for intermittent workloads")
Cc: Peter Zijlstra (Intel)
Cc: Marco Elver
Cc: Ingo Molnar
Cc: Gabriele Monaco
Signed-off-by: Mathieu Desnoyers
---
 include/linux/mm_types.h |  7 ++++---
 kernel/sched/sched.h     | 24 +++++++++++++++++++++---
 2 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 7361a8f3ab68..d56948a74254 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -843,10 +843,11 @@ struct mm_struct {
 		 */
 		unsigned int nr_cpus_allowed;
 		/**
-		 * @max_nr_cid: Maximum number of concurrency IDs allocated.
+		 * @max_nr_cid: Maximum number of allowed concurrency
+		 * IDs allocated.
 		 *
-		 * Track the highest number of concurrency IDs allocated for the
-		 * mm.
+		 * Track the highest number of allowed concurrency IDs
+		 * allocated for the mm.
 		 */
 		atomic_t max_nr_cid;
 		/**
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 76f5f53a645f..7df01dc796dc 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -3657,10 +3657,27 @@ static inline int __mm_cid_try_get(struct task_struct *t, struct mm_struct *mm)
 {
 	struct cpumask *cidmask = mm_cidmask(mm);
 	struct mm_cid __percpu *pcpu_cid = mm->pcpu_cid;
-	int cid = __this_cpu_read(pcpu_cid->recent_cid);
+	int cid, max_nr_cid, allowed_max_nr_cid;
 
+	/*
+	 * After shrinking the number of threads or reducing the number
+	 * of allowed cpus, reduce the value of max_nr_cid so expansion
+	 * of cid allocation will preserve cache locality if the number
+	 * of threads or allowed cpus increase again.
+	 */
+	max_nr_cid = atomic_read(&mm->max_nr_cid);
+	while ((allowed_max_nr_cid = min_t(int, READ_ONCE(mm->nr_cpus_allowed), atomic_read(&mm->mm_users))),
+	       max_nr_cid > allowed_max_nr_cid) {
+		/* atomic_try_cmpxchg loads previous mm->max_nr_cid into max_nr_cid. */
+		if (atomic_try_cmpxchg(&mm->max_nr_cid, &max_nr_cid, allowed_max_nr_cid)) {
+			max_nr_cid = allowed_max_nr_cid;
+			break;
+		}
+	}
 	/* Try to re-use recent cid. This improves cache locality. */
-	if (!mm_cid_is_unset(cid) && !cpumask_test_and_set_cpu(cid, cidmask))
+	cid = __this_cpu_read(pcpu_cid->recent_cid);
+	if (!mm_cid_is_unset(cid) && cid < max_nr_cid &&
+	    !cpumask_test_and_set_cpu(cid, cidmask))
 		return cid;
 	/*
 	 * Expand cid allocation if the maximum number of concurrency
@@ -3668,8 +3685,9 @@ static inline int __mm_cid_try_get(struct task_struct *t, struct mm_struct *mm)
 	 * and number of threads. Expanding cid allocation as much as
 	 * possible improves cache locality.
 	 */
-	cid = atomic_read(&mm->max_nr_cid);
+	cid = max_nr_cid;
 	while (cid < READ_ONCE(mm->nr_cpus_allowed) && cid < atomic_read(&mm->mm_users)) {
+		/* atomic_try_cmpxchg loads previous mm->max_nr_cid into cid. */
 		if (!atomic_try_cmpxchg(&mm->max_nr_cid, &cid, cid + 1))
 			continue;
 		if (!cpumask_test_and_set_cpu(cid, cidmask))
-- 
2.39.5
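
For readers following along, below is a stand-alone user-space sketch of
the allocation policy the two sched.h hunks implement. It is not kernel
code and only models the idea under simplifying assumptions: struct
mm_model, cid_try_get() and the 64-bit cidmask word are invented
stand-ins for mm_struct, __mm_cid_try_get() and the per-mm cpumask, C11
atomics replace atomic_try_cmpxchg(), and recent_cid == -1 stands in for
mm_cid_is_unset(). It shows max_nr_cid being clamped down to
min(nr_cpus_allowed, mm_users), so a stale high recent_cid is rejected
while a recent_cid below the clamped bound is still re-used.

/*
 * Hypothetical user-space model of the clamped cid allocation above.
 * Illustration only; build with: gcc -std=c11 -Wall cid_model.c
 */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

struct mm_model {
	atomic_int max_nr_cid;		/* highest number of cids handed out so far */
	int nr_cpus_allowed;		/* weight of the allowed CPU mask */
	atomic_int mm_users;		/* number of threads sharing the mm */
	atomic_ulong cidmask;		/* bit n set => cid n is in use */
};

static int min_int(int a, int b)
{
	return a < b ? a : b;
}

/* Models cpumask_test_and_set_cpu(): returns true if the bit was already set. */
static bool cid_test_and_set(struct mm_model *mm, int cid)
{
	unsigned long bit = 1UL << cid;

	return atomic_fetch_or(&mm->cidmask, bit) & bit;
}

/* Models __mm_cid_try_get() with the clamping introduced by this patch. */
static int cid_try_get(struct mm_model *mm, int recent_cid)
{
	int cid, max_nr_cid, allowed_max_nr_cid;

	/* Clamp max_nr_cid to min(nr_cpus_allowed, mm_users). */
	max_nr_cid = atomic_load(&mm->max_nr_cid);
	while ((allowed_max_nr_cid = min_int(mm->nr_cpus_allowed,
					     atomic_load(&mm->mm_users))),
	       max_nr_cid > allowed_max_nr_cid) {
		/* On failure, max_nr_cid is reloaded with the current value. */
		if (atomic_compare_exchange_strong(&mm->max_nr_cid, &max_nr_cid,
						   allowed_max_nr_cid)) {
			max_nr_cid = allowed_max_nr_cid;
			break;
		}
	}
	/* Re-use the recent cid only if it sits below the clamped bound. */
	if (recent_cid >= 0 && recent_cid < max_nr_cid &&
	    !cid_test_and_set(mm, recent_cid))
		return recent_cid;
	/* Otherwise expand allocation up to min(nr_cpus_allowed, mm_users). */
	cid = max_nr_cid;
	while (cid < mm->nr_cpus_allowed && cid < atomic_load(&mm->mm_users)) {
		if (!atomic_compare_exchange_strong(&mm->max_nr_cid, &cid, cid + 1))
			continue;
		if (!cid_test_and_set(mm, cid))
			return cid;
	}
	return -1;	/* the kernel caller would fall back to scanning the cidmask */
}

int main(void)
{
	/* A process that once ran 8 threads (max_nr_cid == 8) is now down to 3. */
	struct mm_model mm = {
		.max_nr_cid = 8,
		.nr_cpus_allowed = 8,
		.mm_users = 3,
		.cidmask = 0,
	};

	/* recent_cid 6 is rejected: the clamp lowers max_nr_cid to 3 first. */
	printf("recent_cid 6 -> cid %d, max_nr_cid %d\n",
	       cid_try_get(&mm, 6), atomic_load(&mm.max_nr_cid));
	/* recent_cid 1 is below the clamped bound and is re-used. */
	printf("recent_cid 1 -> cid %d, max_nr_cid %d\n",
	       cid_try_get(&mm, 1), atomic_load(&mm.max_nr_cid));
	return 0;
}

In this toy run the first call returns -1 with max_nr_cid shrunk to 3
(in the kernel, the caller would then pick the first free cid from the
cidmask), and the second call re-uses cid 1, which matches the intent
of compacting cids back towards 0 after the thread count drops.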