From nobody Tue May 26 08:52:39 2026
From: Luka Bai
Date: Tue, 12 May 2026 14:19:57 +0800
Subject: [PATCH 1/6] psi: move curr_in_memstall out of psi_group_change
X-Mailing-List: linux-kernel@vger.kernel.org
Message-Id: <20260512-psi_impr-v1-1-2b7f10fdfad5@tencent.com>
References: <20260512-psi_impr-v1-0-2b7f10fdfad5@tencent.com>
In-Reply-To: <20260512-psi_impr-v1-0-2b7f10fdfad5@tencent.com>
To: linux-mm@kvack.org
Cc: Johannes Weiner, Suren Baghdasaryan, Peter Zijlstra, Ingo Molnar,
 Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt,
 Ben Segall, Mel Gorman, Valentin Schneider, K Prateek Nayak,
 Andrew Morton, David Hildenbrand, Lorenzo Stoakes, "Liam R. Howlett",
 Vlastimil Babka, Mike Rapoport, Michal Hocko, Kees Cook, Tejun Heo,
 Michal Koutný, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
 Luka Bai
X-Mailer: b4 0.15.2

From: Luka Bai

psi_group_change() currently decides whether the running task is in a
memory stall by reading cpu_curr(cpu)->in_memstall, which takes several
memory accesses. psi_group_change() is called once for every parent
cgroup, so the lookup is repeated even though its value is the same for
all of these parent cgroups.

Move the lookup out of psi_group_change() and pass the result in as a
new curr_in_memstall argument, for two reasons:

1. The value is read once instead of once per parent cgroup, which
   avoids possibly unnecessary cacheline stalls.

2. In a function like psi_task_switch(), there is no need to call
   cpu_curr(cpu) to get the task that is currently running on the CPU's
   runqueue. In that context "next" is the running task, so the costly
   lookup can be skipped.

Signed-off-by: Luka Bai
---
 kernel/sched/psi.c | 23 ++++++++++++++++-------
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index d9c9d9480a45..27097cb0dc79 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -795,7 +795,7 @@ static void record_times(struct psi_group_cpu *groupc, u64 now)
 
 static void psi_group_change(struct psi_group *group, int cpu,
 			     unsigned int clear, unsigned int set,
-			     u64 now, bool wake_clock)
+			     u64 now, bool wake_clock, bool curr_in_memstall)
 {
 	struct psi_group_cpu *groupc;
 	unsigned int t, m;
@@ -868,7 +868,7 @@ static void psi_group_change(struct psi_group *group, int cpu,
 	 * task in a cgroup is in_memstall, the corresponding groupc
 	 * on that cpu is in PSI_MEM_FULL state.
 	 */
-	if (unlikely((state_mask & PSI_ONCPU) && cpu_curr(cpu)->in_memstall))
+	if (unlikely((state_mask & PSI_ONCPU) && curr_in_memstall))
 		state_mask |= (1 << PSI_MEM_FULL);
 
 	record_times(groupc, now);
@@ -910,6 +910,7 @@ void psi_task_change(struct task_struct *task, int clear, int set)
 {
 	int cpu = task_cpu(task);
 	u64 now;
+	bool curr_in_memstall;
 
 	if (!task->pid)
 		return;
@@ -917,9 +918,11 @@ void psi_task_change(struct task_struct *task, int clear, int set)
 	psi_flags_change(task, clear, set);
 
 	psi_write_begin(cpu);
+	curr_in_memstall = cpu_curr(cpu)->in_memstall;
 	now = cpu_clock(cpu);
 	for_each_group(group, task_psi_group(task))
-		psi_group_change(group, cpu, clear, set, now, true);
+		psi_group_change(group, cpu, clear, set, now, true,
+				 curr_in_memstall);
 	psi_write_end(cpu);
 }
 
@@ -929,11 +932,13 @@ void psi_task_switch(struct task_struct *prev, struct task_struct *next,
 	struct psi_group *common = NULL;
 	int cpu = task_cpu(prev);
 	u64 now;
+	bool curr_in_memstall = false;
 
 	psi_write_begin(cpu);
 	now = cpu_clock(cpu);
 
 	if (next->pid) {
+		curr_in_memstall = next->in_memstall;
 		psi_flags_change(next, 0, TSK_ONCPU);
 		/*
 		 * Set TSK_ONCPU on @next's cgroups. If @next shares any
@@ -947,7 +952,8 @@ void psi_task_switch(struct task_struct *prev, struct task_struct *next,
 				common = group;
 				break;
 			}
-			psi_group_change(group, cpu, 0, TSK_ONCPU, now, true);
+			psi_group_change(group, cpu, 0, TSK_ONCPU, now, true,
+					 curr_in_memstall);
 		}
 	}
 
@@ -984,7 +990,8 @@ void psi_task_switch(struct task_struct *prev, struct task_struct *next,
 		for_each_group(group, task_psi_group(prev)) {
 			if (group == common)
 				break;
-			psi_group_change(group, cpu, clear, set, now, wake_clock);
+			psi_group_change(group, cpu, clear, set, now, wake_clock,
+					 curr_in_memstall);
 		}
 
 		/*
@@ -996,7 +1003,8 @@ void psi_task_switch(struct task_struct *prev, struct task_struct *next,
 		if ((prev->psi_flags ^ next->psi_flags) & ~TSK_ONCPU) {
 			clear &= ~TSK_ONCPU;
 			for_each_group(group, common)
-				psi_group_change(group, cpu, clear, set, now, wake_clock);
+				psi_group_change(group, cpu, clear, set, now, wake_clock,
+						 curr_in_memstall);
 		}
 	}
 	psi_write_end(cpu);
@@ -1236,7 +1244,8 @@ void psi_cgroup_restart(struct psi_group *group)
 
 		psi_write_begin(cpu);
 		now = cpu_clock(cpu);
-		psi_group_change(group, cpu, 0, 0, now, true);
+		psi_group_change(group, cpu, 0, 0, now, true,
+				 cpu_curr(cpu)->in_memstall);
 		psi_write_end(cpu);
 	}
 }
-- 
2.52.0

From nobody Tue May 26 08:52:39 2026
From: Luka Bai
Date: Tue, 12 May 2026 14:19:58 +0800
Subject: [PATCH 2/6] psi: reorganize the psi members for cacheline benefits
Message-Id: <20260512-psi_impr-v1-2-2b7f10fdfad5@tencent.com>
References: <20260512-psi_impr-v1-0-2b7f10fdfad5@tencent.com>
In-Reply-To: <20260512-psi_impr-v1-0-2b7f10fdfad5@tencent.com>

From: Luka Bai

Currently, we check whether a task needs psi accounting by reading
task->pid, which does not share a cacheline with other psi variables
such as in_memstall. According to perf record, this can cause cacheline
stalls, so we would like to place these variables together.

However, directly swapping the order of pid and restart_block may cause
cacheline problems in other scenarios that are hard to identify
clearly. So we add a need_psi bitfield that carries the same
information and put it next to in_memstall. The value of need_psi never
changes after the task is created, so there is no synchronization
problem. Adding one bit to an existing unsigned int bitfield does not
enlarge task_struct or change its memory layout at all.

We also move psi_flags, which is only 5 bits long, next to in_memstall
and need_psi so that all of them are cacheline optimized. The 5 extra
bits fit into the same unsigned int, so this does not enlarge
task_struct either. On the contrary, it can shrink task_struct, since
the psi_flags member that used to be a separate unsigned int is
eliminated.

We also add NR_TSK_ONCPU and NR_PSI_ALL_COUNTS to the psi_task_count
enum to make the semantics clearer, and move the enum definition from
linux/psi_types.h into linux/sched.h, since the enum values are now
needed there. These two changes make no functional difference to the
code.

Signed-off-by: Luka Bai
---
 include/linux/psi_types.h | 20 +-------------------
 include/linux/sched.h     | 29 +++++++++++++++++++++++++----
 kernel/fork.c             | 10 ++++++++++
 kernel/sched/psi.c        |  6 +++---
 4 files changed, 39 insertions(+), 26 deletions(-)

diff --git a/include/linux/psi_types.h b/include/linux/psi_types.h
index dd10c22299ab..5639dcdd90af 100644
--- a/include/linux/psi_types.h
+++ b/include/linux/psi_types.h
@@ -10,24 +10,6 @@
 
 #ifdef CONFIG_PSI
 
-/* Tracked task states */
-enum psi_task_count {
-	NR_IOWAIT,
-	NR_MEMSTALL,
-	NR_RUNNING,
-	/*
-	 * For IO and CPU stalls the presence of running/oncpu tasks
-	 * in the domain means a partial rather than a full stall.
-	 * For memory it's not so simple because of page reclaimers:
-	 * they are running/oncpu while representing a stall. To tell
-	 * whether a domain has productivity left or not, we need to
-	 * distinguish between regular running (i.e. productive)
-	 * threads and memstall ones.
-	 */
-	NR_MEMSTALL_RUNNING,
-	NR_PSI_TASK_COUNTS = 4,
-};
-
 /* Task state bitmasks */
 #define TSK_IOWAIT	(1 << NR_IOWAIT)
 #define TSK_MEMSTALL	(1 << NR_MEMSTALL)
@@ -35,7 +17,7 @@ enum psi_task_count {
 #define TSK_MEMSTALL_RUNNING	(1 << NR_MEMSTALL_RUNNING)
 
 /* Only one task can be scheduled, no corresponding task count */
-#define TSK_ONCPU	(1 << NR_PSI_TASK_COUNTS)
+#define TSK_ONCPU	(1 << NR_TSK_ONCPU)
 
 /* Resources that workloads could be stalled on */
 enum psi_res {
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 368c7b4d7cb5..34d7f80531e7 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -817,6 +817,28 @@ struct kmap_ctrl {
 #endif
 };
 
+#ifdef CONFIG_PSI
+/* Tracked task states */
+enum psi_task_count {
+	NR_IOWAIT,
+	NR_MEMSTALL,
+	NR_RUNNING,
+	/*
+	 * For IO and CPU stalls the presence of running/oncpu tasks
+	 * in the domain means a partial rather than a full stall.
+	 * For memory it's not so simple because of page reclaimers:
+	 * they are running/oncpu while representing a stall. To tell
+	 * whether a domain has productivity left or not, we need to
+	 * distinguish between regular running (i.e. productive)
+	 * threads and memstall ones.
+	 */
+	NR_MEMSTALL_RUNNING,
+	NR_PSI_TASK_COUNTS,
+	NR_TSK_ONCPU = NR_PSI_TASK_COUNTS,
+	NR_PSI_ALL_COUNTS,
+};
+#endif
+
 struct task_struct {
 #ifdef CONFIG_THREAD_INFO_IN_TASK
 	/*
@@ -1030,6 +1052,9 @@ struct task_struct {
 #ifdef CONFIG_PSI
 	/* Stalled due to lack of memory */
 	unsigned			in_memstall:1;
+	unsigned			need_psi:1;
+	/* Pressure stall state */
+	unsigned			psi_flags:NR_PSI_ALL_COUNTS;
 #endif
 #ifdef CONFIG_PAGE_OWNER
 	/* Used by page_owner=on to detect recursion in page tracking. */
@@ -1299,10 +1324,6 @@ struct task_struct {
 	kernel_siginfo_t		*last_siginfo;
 
 	struct task_io_accounting	ioac;
-#ifdef CONFIG_PSI
-	/* Pressure stall state */
-	unsigned int			psi_flags;
-#endif
 #ifdef CONFIG_TASK_XACCT
 	/* Accumulated RSS usage: */
 	u64				acct_rss_mem1;
diff --git a/kernel/fork.c b/kernel/fork.c
index 0d97fd71d7f6..20b47c876b27 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2177,6 +2177,16 @@ __latent_entropy struct task_struct *copy_process(
 
 #ifdef CONFIG_PSI
 	p->psi_flags = 0;
+	/*
+	 * Only setup need_psi to 1 for non-idle tasks. We
+	 * also need to reset need_psi of idle tasks to 0 since
+	 * their values are copied from the init task whose
+	 * need_psi is not 0.
+	 */
+	if (pid != &init_struct_pid)
+		p->need_psi = 1;
+	else
+		p->need_psi = 0;
 #endif
 
 	task_io_accounting_init(&p->ioac);
diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index 27097cb0dc79..7374c05a5751 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -912,7 +912,7 @@ void psi_task_change(struct task_struct *task, int clear, int set)
 	u64 now;
 	bool curr_in_memstall;
 
-	if (!task->pid)
+	if (!task->need_psi)
 		return;
 
 	psi_flags_change(task, clear, set);
@@ -937,7 +937,7 @@ void psi_task_switch(struct task_struct *prev, struct task_struct *next,
 	psi_write_begin(cpu);
 	now = cpu_clock(cpu);
 
-	if (next->pid) {
+	if (next->need_psi) {
 		curr_in_memstall = next->in_memstall;
 		psi_flags_change(next, 0, TSK_ONCPU);
 		/*
@@ -957,7 +957,7 @@ void psi_task_switch(struct task_struct *prev, struct task_struct *next,
 		}
 	}
 
-	if (prev->pid) {
+	if (prev->need_psi) {
 		int clear = TSK_ONCPU, set = 0;
 		bool wake_clock = true;
 
-- 
2.52.0

From nobody Tue May 26 08:52:39 2026
From: Luka Bai
Date: Tue, 12 May 2026 14:19:59 +0800
Subject: [PATCH 3/6] psi: use prefetch to preread the parent groupc
Message-Id: <20260512-psi_impr-v1-3-2b7f10fdfad5@tencent.com>
References: <20260512-psi_impr-v1-0-2b7f10fdfad5@tencent.com>
In-Reply-To: <20260512-psi_impr-v1-0-2b7f10fdfad5@tencent.com>

From: Luka Bai

When psi_group_change() is called for a task, we always iterate over
all the cgroups from the child all the way up to the root cgroup. The
groups are connected like a linked list (each group points to its
parent), so it is hard for the CPU to prefetch the parent on its own.
So we add an explicit prefetch of the parent groupc, which gives a
noticeable benefit in the final result.
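For readers less familiar with the technique, here is a minimal
user-space sketch of prefetching the parent while walking a
parent-linked hierarchy. It is not the kernel code: struct node, its
hits field and touch_ancestors() are hypothetical names made up for
illustration, and the sketch assumes GCC or Clang, whose
__builtin_prefetch() takes the address and a read/write hint.

```c
#include <stddef.h>

/* Hypothetical stand-in for a per-group record; not the kernel's struct. */
struct node {
	struct node *parent;	/* NULL at the root */
	unsigned long hits;	/* field written on every visit */
};

/*
 * Visit @n and all of its ancestors. Before working on the current
 * node, ask the CPU to start loading the parent for writing, so the
 * load can overlap with the work done on the current node.
 * Returns the number of nodes visited.
 */
static size_t touch_ancestors(struct node *n)
{
	size_t visited = 0;

	for (; n; n = n->parent) {
		if (n->parent)
			__builtin_prefetch(n->parent, 1);	/* 1 = prefetch for write */
		n->hits++;
		visited++;
	}
	return visited;
}
```

The prefetch is only a hint: the walk produces the same result with or
without it, and whether it helps depends on how much work is done per
node and on whether the parent is already in cache.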

Signed-off-by: Luka Bai
---
 kernel/sched/psi.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index 7374c05a5751..9b7a85d1bc28 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -793,6 +793,15 @@ static void record_times(struct psi_group_cpu *groupc, u64 now)
 #define for_each_group(iter, group) \
 	for (typeof(group) iter = group; iter; iter = iter->parent)
 
+static inline struct psi_group_cpu *prefetch_and_get_groupc(struct psi_group *group, int cpu)
+{
+	struct psi_group_cpu *groupc = per_cpu_ptr(group->pcpu, cpu);
+
+	if (group->parent)
+		prefetchw(per_cpu_ptr(group->parent->pcpu, cpu));
+	return groupc;
+}
+
 static void psi_group_change(struct psi_group *group, int cpu,
 			     unsigned int clear, unsigned int set,
 			     u64 now, bool wake_clock, bool curr_in_memstall)
@@ -802,7 +811,7 @@ static void psi_group_change(struct psi_group *group, int cpu,
 	u32 state_mask;
 
 	lockdep_assert_rq_held(cpu_rq(cpu));
-	groupc = per_cpu_ptr(group->pcpu, cpu);
+	groupc = prefetch_and_get_groupc(group, cpu);
 
 	/*
 	 * Start with TSK_ONCPU, which doesn't have a corresponding
-- 
2.52.0

From nobody Tue May 26 08:52:39 2026
bh=lzCq7haBc4sca2HsI7iigGUxADMapnK6Tk4EPTZmuyE=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=ZqTwbnBs+VlCGHmA4Rz6c8PMQdztOClzy4sb7OODMnSJRBMxuC0DJd8mL9+WVguC9s19vSNI3EoOCuQIlVVySMnllJRvd+iM7eiVEVV927sePIK7AVuTtcYvFsahrzlFJK/QAdifDrSOftU7owbTVSvHchNc5w97WfAcxqYxzEA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=icloud.com; spf=pass smtp.mailfrom=icloud.com; dkim=pass (2048-bit key) header.d=icloud.com header.i=@icloud.com header.b=ow4GRMhB; arc=none smtp.client-ip=57.103.91.52 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=icloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=icloud.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=icloud.com header.i=@icloud.com header.b="ow4GRMhB" Received: from outbound.ci.icloud.com (unknown [127.0.0.2]) by p00-icloudmta-asmtp-us-central-1k-100-percent-4 (Postfix) with ESMTPS id 9779B1800103; Tue, 12 May 2026 06:20:44 +0000 (UTC) X-ICL-Out-Info: HUtFAUMEWwJACUgBTUQeDx5WFlZNRAJCTQhJB0MFXwReC0sKQw5eEhVdRV8YXApUH1oNQC1eCF4fTBwdDlgGEhZdRV8YXApUH1oNQC1eCF4fTBwdDlgGEgJaRQFbFwNXHFZFXBhDCV0FVxwdDl5FWxNVF0YJGQhdHRkIRx8KMANCDlYDQwdFAC0ZHFdQXgheH0wcHQ5YBhIdUBwOUQVbAEYJTQJfGhtBGWYRXh1FRkRBFEweX1VcVEEJHlcLVg8HME0dXQ5SBUZeWhdeUxcfSwBcRVoOWwRHFA== Dkim-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=icloud.com; s=1a1hai; t=1778566850; x=1781158850; bh=HFQgmeVI0oO/YyDZZt3bpmB4cnLvJvO3q2IZPwc5y8Y=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:To:x-icloud-hme; b=ow4GRMhBy3HR3saemv/Dx4xL6GdvF/TMpMEdR5eY1ONuQO7wkW4Swd4qTFXa17Aias7GbIze2IUr/eRh4l32XyiUsJzHh9pLwaprqPU+8chKFycy6NCIkfpvLsWrbYMsUZdhsqiAYCGQLLqBRi1bo8Ym0+0txhw+joxzLTguTR59M6RZ+sHVPHl/Jurf1pDXaWxfhnAQO02gvLTdZrbj16BuscOwEcc4U8yTFeDIhE7W2IqtU2RSXHvvuF+L5j01s2CJHZvNwSu4gh8mqLeXWzuDy0uXxIgTC1oOh31f3tia8kCf+bTm8FMvEyMRcPE6+wO3QjerVehpAikR4AuO5g== Received: 
from [127.0.0.1] (unknown [17.57.156.36]) by p00-icloudmta-asmtp-us-central-1k-100-percent-4 (Postfix) with ESMTPSA id 8424B180013B; Tue, 12 May 2026 06:20:35 +0000 (UTC) From: Luka Bai Date: Tue, 12 May 2026 14:20:00 +0800 Subject: [PATCH 4/6] psi: do not call record_times when the state is not changed Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260512-psi_impr-v1-4-2b7f10fdfad5@tencent.com> References: <20260512-psi_impr-v1-0-2b7f10fdfad5@tencent.com> In-Reply-To: <20260512-psi_impr-v1-0-2b7f10fdfad5@tencent.com> To: linux-mm@kvack.org Cc: Johannes Weiner , Suren Baghdasaryan , Peter Ziljstra , Ingo Molnar , Juri Lelli , Vincent Guittot , Dietmar Eggemann , Steven Rostedt , Ben Segall , Mel Gorman , Valentin Schneider , K Prateek Nayak , Andrew Morton , David Hildenbrand , Lorenzo Stoakes , "Liam R. Howlett" , Vlastimil Babka , Mike Rapoport , Michal Hocko , Kees Cook , Tejun Heo , =?utf-8?q?Michal_Koutn=C3=BD?= , linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, Luka Bai X-Mailer: b4 0.15.2 X-Developer-Signature: v=1; a=ed25519-sha256; t=1778566802; l=1376; i=lukabai@tencent.com; s=20260501; h=from:subject:message-id; bh=K3W73Ej/H0RH/u7ug92VzPA/oQvz5jykvDzJfWcXDc8=; b=UOPUAuG2SMAddJiUXqFia9+qZ6pYXrAI5NMq3HPcY0QsvJzfIJo2mUffCromL/WOhhHUFI1ss lo2PUu8RNO3C9et3pjuE+4l0GXHz36Pxes0NjKltfze7gQcrJHXY0gR X-Developer-Key: i=lukabai@tencent.com; a=ed25519; pk=KeaVteSWd00GIAjFyWZnuFsKAKixjga1ZkLMcI66nPM= X-Proofpoint-GUID: g9I8dGtUZgJhv2L6LpR1sKeFx3DMaVBn X-Proofpoint-ORIG-GUID: g9I8dGtUZgJhv2L6LpR1sKeFx3DMaVBn X-Proofpoint-Spam-Details-Enc: AW1haW4tMjYwNTEyMDA2MCBTYWx0ZWRfX7NXkfgr0AzlI BXnEsThg68asqk8eaLufSGZHZ5yc5dwqWIlcVz4z10yrJRZ0FsAfTumvp50tBpxWYyJTF5erq8u yNUYifqjuq8QbLxjBK2DHJIvwXcPkca0bWhjmNWxrdLBmL3nyCnTfWJcZIvO296B61gajcwNmns 
QNNda9lnhk3nlq5c16dFjQzLdXRNiziCn0v63gTkl5gICYqz5GUAD5EDa7BTjgwYR38OFdN1NOW Tvfny2N7FTNLj4AQXz+UZepLPnyqIubWEmjUbwQAy2Ss+/KNYSA8ScW1PyKZR/EHRXc6qcCCGzp +x2lzIs9iyeaqXUDF8ZmCtH155clIcvl1GoS77f/4PRHaawJNao38qQ+HeFI2c= X-Authority-Info-Out: v=2.4 cv=JfSxbEKV c=1 sm=1 tr=0 ts=6a02c6c0 cx=c_apl:c_pps:t_out a=2G65uMN5HjSv0sBfM2Yj2w==:117 a=2G65uMN5HjSv0sBfM2Yj2w==:17 a=jPpEGkWhOON_I5X-:21 a=IkcTkHD0fZMA:10 a=NGcC8JguVDcA:10 a=x7bEGLp0ZPQA:10 a=UaoJkeuwEpQA:10 a=VkNPw1HP01LnGYTKEx00:22 a=GvQkQWPkAAAA:8 a=wbxJtr-Xvdy7Wj_ENd8A:9 a=QEXdDO2ut3YA:10 From: Luka Bai In psi_group_change, record_times is always called no matter whether the state_mask changes. Since it can cost some performance, we choose to not to do it unconditionally. If the state has not changed, we can keep the psi time unchanged. This will not make any difference to the final result since when we need to acquire the psi time, get_recent_times() will always calculate the remaining time into the final result. Signed-off-by: Luka Bai --- kernel/sched/psi.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c index 9b7a85d1bc28..4c4bd134c785 100644 --- a/kernel/sched/psi.c +++ b/kernel/sched/psi.c @@ -880,9 +880,15 @@ static void psi_group_change(struct psi_group *group, = int cpu, if (unlikely((state_mask & PSI_ONCPU) && curr_in_memstall)) state_mask |=3D (1 << PSI_MEM_FULL); =20 - record_times(groupc, now); - - groupc->state_mask =3D state_mask; + /* + * We only need to record times when the state changes. Or + * we can keep it unchanged and wait for get_recent_times() + * to handle the remaining time. 
+	 */
+	if (state_mask != groupc->state_mask) {
+		record_times(groupc, now);
+		groupc->state_mask = state_mask;
+	}
 
 	if (state_mask & group->rtpoll_states)
 		psi_schedule_rtpoll_work(group, 1, false);
-- 
2.52.0

From nobody Tue May 26 08:52:39 2026 Received: from outbound.ci.icloud.com (ci-2001j-snip4-11.eps.apple.com [57.103.91.103]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D007947CC8A for ; Tue, 12 May 2026 06:20:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=57.103.91.103 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1778566874; cv=none; b=gLDXBKWvPTANtWT+9hMrZ5ppGxQASbB5369ABO8w/gKjF5+FN5NNB+tCqcMJUjHQd/LqmTVDXwdg3BtAo/Lz9vUjSEV9D2Z5qAdyi7/1r4LeL1wbGL7BmgGQPWQ9FQN1+0e/VQFvEVNFGzTwmLTHlI4N3GrvDwFaOFEGgf23Ixw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1778566874; c=relaxed/simple; bh=9FP8n6XGR1QceBaNZQHKC7yOvhgDJhFZnp5vXDSzB/E=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=HcIz2cE19UiYBtygogvC33Y55V5H5R2IqsHZ64ii3hB1R6q3RBTDhSdXF2Gxmh9kD6lBQGYnIwLJ1YRNm3SshSDDEu91WHVSvUS6F85K2ip7hDpXurHSefAUcFRAUTJ6pS5I3ySedTdqGTGPvjZ0f9f5IaOOYRZzCfI/xjcz7jQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=icloud.com; spf=pass smtp.mailfrom=icloud.com; dkim=pass (2048-bit key) header.d=icloud.com header.i=@icloud.com header.b=WfXdgDmA; arc=none smtp.client-ip=57.103.91.103 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=icloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=icloud.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=icloud.com header.i=@icloud.com header.b="WfXdgDmA" Received: from outbound.ci.icloud.com (unknown [127.0.0.2]) 
by p00-icloudmta-asmtp-us-central-1k-100-percent-4 (Postfix) with ESMTPS id C11F2180011C; Tue, 12 May 2026 06:20:52 +0000 (UTC) X-ICL-Out-Info: HUtFAUMEWwJACUgBTUQeDx5WFlZNRAJCTQhJB0MFXwReC0sKQw5eEhVdRV8YXApUH1oNQC1eCF4fTBwdDlgGEhZdRV8YXApUH1oNQC1eCF4fTBwdDlgGEgJaRQFbFwNXHFZFXBhDCV0FVxwdDl5FWxNVF0YJGQhdHRkIRx8KMANCDlYDQwdFAC0ZHFdQXgheH0wcHQ5YBhIdUBwOUQVbAEYJTQJfGhtBGWYRXh1FRkRBFE0eX1VcVEEJHlcLVg8HME0dXQ5SBUZeWhdeUxcfSwBcRVoOWwRHFA== Dkim-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=icloud.com; s=1a1hai; t=1778566856; x=1781158856; bh=mKgIAnZB5fUXzrw1IdBcNeVNTkRS5dYeFy3GvsnLKiY=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:To:x-icloud-hme; b=WfXdgDmAI2tBeLM82M0xxxdMvosHkKR1IXKAu/jjSOrU8tQl/ZxyO3RH/ha7K4L+3NlLDISMcPN0ZrGkBDF8WI30KLA9MLcBFBHGNQ4cVOim34Ff5eJ4TX5DdPwpNpwjXprDJ4PQQg+6WG5wq7ByFoDCk+J0CHB2hV6WWpxIa1I3v8FY7lyYBaIQSc6NEz3T8+Zk1NdwdYvvpAAv3Zp0J4Hg923FmcjQ38IwOgk6RGi6iwCeiqyxvsuxYgmg7E+CQLTr0tYRxTfYQN/+rf5N7v2QSN3rcxkNF+sD/HTl/mG2NhRRgPTPJnSGAwUmpovYqVgjulEgfPivLYsRlYTnNw== Received: from [127.0.0.1] (unknown [17.57.156.36]) by p00-icloudmta-asmtp-us-central-1k-100-percent-4 (Postfix) with ESMTPSA id 5BCAF18000B9; Tue, 12 May 2026 06:20:44 +0000 (UTC) From: Luka Bai Date: Tue, 12 May 2026 14:20:01 +0800 Subject: [PATCH 5/6] psi: add psi group for the root cgroup Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260512-psi_impr-v1-5-2b7f10fdfad5@tencent.com> References: <20260512-psi_impr-v1-0-2b7f10fdfad5@tencent.com> In-Reply-To: <20260512-psi_impr-v1-0-2b7f10fdfad5@tencent.com> To: linux-mm@kvack.org Cc: Johannes Weiner , Suren Baghdasaryan , Peter Ziljstra , Ingo Molnar , Juri Lelli , Vincent Guittot , Dietmar Eggemann , Steven Rostedt , Ben Segall , Mel Gorman , Valentin Schneider , K Prateek Nayak , Andrew Morton , David Hildenbrand , Lorenzo Stoakes , "Liam R. 
Howlett" , Vlastimil Babka , Mike Rapoport , Michal Hocko , Kees Cook , Tejun Heo , =?utf-8?q?Michal_Koutn=C3=BD?= , linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, Luka Bai X-Mailer: b4 0.15.2 X-Developer-Signature: v=1; a=ed25519-sha256; t=1778566802; l=1490; i=lukabai@tencent.com; s=20260501; h=from:subject:message-id; bh=hegqJxsefRUo0IZb6av0loX0beqXtKAilCIuw9AGMOg=; b=PpQmxUSe15FVJGWEVgQ54mSJ/1lP1TFKzTir1B2BYtnj3SMHDa69594F8Hmf2TlrNR0dmx2qS Q4m22WKPwWwD4F4CLCngciseUZUM95VA5i6FGqMEAV4qErxozcS0EKq X-Developer-Key: i=lukabai@tencent.com; a=ed25519; pk=KeaVteSWd00GIAjFyWZnuFsKAKixjga1ZkLMcI66nPM= X-Proofpoint-Spam-Details-Enc: AW1haW4tMjYwNTEyMDA2MCBTYWx0ZWRfX0EYYtqAKu+rD yhcbvq0Iuaphbygoj5MTpyMOFj/S9zSD7ipPt2xh10r7q6GLMgeWnDOIcMhwQwafZcILtGCijJe SbmPA0Gnc+rUuIib7XjHzO1JEq98Obm3T6djFk6ub0sPso4NIRLwYp5OFingNelgizgGP+jvgXl NQvcUDByWZO1dEFVDB1k38KRMLLNd1u/0vYTQilKG9sz45w0VRy1Ut869gv1hWFslAeF9AyAYSK e69KW+mGRkPOAtA641O+e4T52Zl2XC+afS/pYAmNvMitoPMaa704jw6L7J8HjhHQ9caTLdEgtNp Ds/YlM6zgzuj6xvEB1Ig8gEm52uHMjGdswtA6sHjNq7abTSQiYRzmJOAaVtIn4= X-Authority-Info-Out: v=2.4 cv=BryQAIX5 c=1 sm=1 tr=0 ts=6a02c6c7 cx=c_apl:c_pps:t_out a=2G65uMN5HjSv0sBfM2Yj2w==:117 a=2G65uMN5HjSv0sBfM2Yj2w==:17 a=IkcTkHD0fZMA:10 a=NGcC8JguVDcA:10 a=x7bEGLp0ZPQA:10 a=UaoJkeuwEpQA:10 a=VkNPw1HP01LnGYTKEx00:22 a=GvQkQWPkAAAA:8 a=PktQUFpngk8bp4KwMBsA:9 a=QEXdDO2ut3YA:10 X-Proofpoint-GUID: B6dnkmnWxy2EQg2uAF8glNidlGTVlMce X-Proofpoint-ORIG-GUID: B6dnkmnWxy2EQg2uAF8glNidlGTVlMce

From: Luka Bai

cgroup_psi() currently checks whether the cgroup is the root cgroup to
decide whether to use psi_system instead of cgrp->psi. This is mostly
because the root cgroup of the default hierarchy does not have a psi
group attached. Make psi_system its psi group and remove the condition
from cgroup_psi().

Signed-off-by: Luka Bai
---
 include/linux/psi.h    | 2 +-
 kernel/cgroup/cgroup.c | 3 +++
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/linux/psi.h b/include/linux/psi.h
index e0745873e3f2..8f2db511d051 100644
--- a/include/linux/psi.h
+++ b/include/linux/psi.h
@@ -34,7 +34,7 @@ __poll_t psi_trigger_poll(void **trigger_ptr, struct file *file,
 #ifdef CONFIG_CGROUPS
 static inline struct psi_group *cgroup_psi(struct cgroup *cgrp)
 {
-	return cgroup_ino(cgrp) == 1 ? &psi_system : cgrp->psi;
+	return cgrp->psi;
 }
 
 int psi_cgroup_alloc(struct cgroup *cgrp);
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 43adc96c7f1a..357c68662d18 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -178,6 +178,9 @@ static DEFINE_PER_CPU(struct cgroup_rstat_base_cpu, root_rstat_base_cpu);
 /* the default hierarchy */
 struct cgroup_root cgrp_dfl_root = {
 	.cgrp.self.rstat_cpu = &root_rstat_cpu,
+#ifdef CONFIG_PSI
+	.cgrp.psi = &psi_system,
+#endif
 	.cgrp.rstat_base_cpu = &root_rstat_base_cpu,
 };
 EXPORT_SYMBOL_GPL(cgrp_dfl_root);
-- 
2.52.0

From nobody Tue May 26 08:52:39 2026 Received: from outbound.ci.icloud.com (ci-2001j-snip4-3.eps.apple.com [57.103.91.96]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 770BC43637C for ; Tue, 12 May 2026 06:21:11 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=57.103.91.96 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1778566883; cv=none; b=Yk5EERcgA0VDZRKR4JlHa6kPKTFQVOi5YaEoget1vPmuLwu/Nh3gMukw40yIfwKGIdU0Vr22xOkI9DuHaV1Xg/c1xggoZMkawcT4Y7G8BkDokpF1Qnj1oawkXERd4poHvdGY6J8RRAG0XMjz/GsnNrx8ws7Z/SwI364it3sav4Y= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1778566883; c=relaxed/simple; bh=gNQt9bon6ln9TfhTZIj7An7R0o+Le3B+T4yie/Fk9ao=; 
h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=VyNs/XmhNIGA6zBRbFvVI27WqIsQp5RUMHfCRMOrU81j1+YjSyW9EHAS6y6MQm39K9YjoUf3/StY+DhqSSHvD1hHJMZWgA18ghs81HGZrTaR3vy9DSxDMaTBVFY9V5dNGkkJrXzTh/0GfRt0LMuuPGsZ4GkH/QVWqfjRNi/RCoI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=icloud.com; spf=pass smtp.mailfrom=icloud.com; dkim=pass (2048-bit key) header.d=icloud.com header.i=@icloud.com header.b=HnSdxRDm; arc=none smtp.client-ip=57.103.91.96 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=icloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=icloud.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=icloud.com header.i=@icloud.com header.b="HnSdxRDm" Received: from outbound.ci.icloud.com (unknown [127.0.0.2]) by p00-icloudmta-asmtp-us-central-1k-100-percent-4 (Postfix) with ESMTPS id 8D51B1800169; Tue, 12 May 2026 06:21:00 +0000 (UTC) X-ICL-Out-Info: HUtFAUMEWwJACUgBTUQeDx5WFlZNRAJCTQhJB0MFXwReC0sKQw5eEhVdRV8YXApUH1oNQC1eCF4fTBwdDlgGEhZdRV8YXApUH1oNQC1eCF4fTBwdDlgGEgJaRQFbFwNXHFZFXBhDCV0FVxwdDl5FWxNVF0YJGQhdHRkIRx8KMANCDlYDQwdFAC0ZHFdQXgheH0wcHQ5YBhIdUBwOUQVbAEYJTQJfGhtBGWYRXh1FRkRBFE4eX1VcVEEJHlcLVg8HME0dXQ5SBUZeWhdeUxcfSwBcRVoOWwRHFA== Dkim-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=icloud.com; s=1a1hai; t=1778566865; x=1781158865; bh=R7WZwsOB3CgUd4zcpnHi7ylJ/W9SAO8MOHEKfoRSL5A=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:To:x-icloud-hme; b=HnSdxRDmyDHxxGcgB6hX4nuT6ZqDDeP7hvyGEfS4LRVQ9X+oHmi/ealchoyPYUmbiTKr99/CWKVv8ibjugU1jkchcuUvFj2ORzNxA+9Dq2zNcg1JzWZcWqMrzWo8eBOsbG1qFK0V17qWP3y3VdDeLmAHk+8V86eFSfSkxpSIewVlrn72BWgkIsvZUKv907ta1MmQUX+LeBwOgnhG/8mm83w4vvAhL1u10xGWfPVn8oF2kNdwM/SY65yanfJdRoksN/B3Ae3Tdxxj6SrT4oO1THdDJFIIIH1A77rAYzJCzIN0Fsfm6/FMNIrzTWbGiSDDMJtCyR5LVs9OHKJSDZMVPw== Received: from [127.0.0.1] (unknown [17.57.156.36]) by 
p00-icloudmta-asmtp-us-central-1k-100-percent-4 (Postfix) with ESMTPSA id 92C81180016C; Tue, 12 May 2026 06:20:52 +0000 (UTC) From: Luka Bai Date: Tue, 12 May 2026 14:20:02 +0800 Subject: [PATCH 6/6] psi: remove psi_bug and moves checking of NR_RUNNING ahead. Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260512-psi_impr-v1-6-2b7f10fdfad5@tencent.com> References: <20260512-psi_impr-v1-0-2b7f10fdfad5@tencent.com> In-Reply-To: <20260512-psi_impr-v1-0-2b7f10fdfad5@tencent.com> To: linux-mm@kvack.org Cc: Johannes Weiner , Suren Baghdasaryan , Peter Ziljstra , Ingo Molnar , Juri Lelli , Vincent Guittot , Dietmar Eggemann , Steven Rostedt , Ben Segall , Mel Gorman , Valentin Schneider , K Prateek Nayak , Andrew Morton , David Hildenbrand , Lorenzo Stoakes , "Liam R. Howlett" , Vlastimil Babka , Mike Rapoport , Michal Hocko , Kees Cook , Tejun Heo , =?utf-8?q?Michal_Koutn=C3=BD?= , linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, Luka Bai X-Mailer: b4 0.15.2 X-Developer-Signature: v=1; a=ed25519-sha256; t=1778566802; l=2880; i=lukabai@tencent.com; s=20260501; h=from:subject:message-id; bh=6q4L4uq69DKt/CZJ1lu/3JvGNmB9tf3m8tlVZoTYSQg=; b=slephPrnVgdHjw/+xnQlaJ5+kdbd2sfsfq2sIgLIGUkwgJZ4dCV4BuiON5JrJgQGV1sEI/IsO dEiXPb8JBCQDoJYucm0uqyKoAMhhKAu3MZtlXNAg3A89RU8wTfqQfBV X-Developer-Key: i=lukabai@tencent.com; a=ed25519; pk=KeaVteSWd00GIAjFyWZnuFsKAKixjga1ZkLMcI66nPM= X-Authority-Info-Out: v=2.4 cv=GMUF0+NK c=1 sm=1 tr=0 ts=6a02c6cf cx=c_apl:c_pps:t_out a=2G65uMN5HjSv0sBfM2Yj2w==:117 a=2G65uMN5HjSv0sBfM2Yj2w==:17 a=IkcTkHD0fZMA:10 a=NGcC8JguVDcA:10 a=x7bEGLp0ZPQA:10 a=UaoJkeuwEpQA:10 a=VkNPw1HP01LnGYTKEx00:22 a=GvQkQWPkAAAA:8 a=Ee36wLBZbuEWgZn7rsAA:9 a=QEXdDO2ut3YA:10 X-Proofpoint-GUID: g4Q5OHPjTgqPAB5qPTQmBZ7cS488dJIr X-Proofpoint-Spam-Details-Enc: AW1haW4tMjYwNTEyMDA2MCBTYWx0ZWRfX/akZhom3E3Mp 
ZK3GJGTPs0Dt81nwy1vbFDzb9wORUetUdzgi/h13hsgnrPxZadwvVbQe9k1gZS2uNoRh5uCbwHG IpoU6N9uKnSOeeL2cqA5feDS7PPT0/P1VY0A+nlayNV6c5+nShOLlVQDZSEEdqqVvGINfATawDM zlBZOHti0EVgG5zro2PpOvkqJjwmdqQrgyQTL0b4w7eRtdsb/YYOB9bnWWSwyvujPtjnMvqpa4m o0Ydn0eYnDV+3Ng+A0C6bBhg3uMK/jvdng0P2X4oakv0V2nHTGUEAZSdSdU8MMSulVVhSzpOH+M dqZe0HZPDGFgFgxfgKfblSR7yvzznD3RjnfH6+2keJEgDUdQTWnRNYJqQvlvnc= X-Proofpoint-ORIG-GUID: g4Q5OHPjTgqPAB5qPTQmBZ7cS488dJIr

From: Luka Bai

While accounting psi states, the code does some bug detection to keep
it maintainable, and uses the variable psi_bug to make sure the message
is printed only once. Replace psi_bug with printk_deferred_once(),
since the effect is similar and the result is more readable. Also
annotate these bug-detection branches with likely() and unlikely(), and
check tasks[NR_RUNNING] first in the PSI_NONIDLE test in test_states().

Signed-off-by: Luka Bai
---
 kernel/sched/psi.c | 21 ++++++++-------------
 1 file changed, 8 insertions(+), 13 deletions(-)

diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index 4c4bd134c785..70dd642af5e0 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -141,8 +141,6 @@
 #include
 #include "sched.h"
 
-static int psi_bug __read_mostly;
-
 DEFINE_STATIC_KEY_FALSE(psi_disabled);
 static DEFINE_STATIC_KEY_TRUE(psi_cgroups_enabled);
 
@@ -262,7 +260,7 @@ static u32 test_states(unsigned int *tasks, u32 state_mask)
 	if (tasks[NR_RUNNING] && !oncpu)
 		state_mask |= BIT(PSI_CPU_FULL);
 
-	if (tasks[NR_IOWAIT] || tasks[NR_MEMSTALL] || tasks[NR_RUNNING])
+	if (tasks[NR_RUNNING] || tasks[NR_MEMSTALL] || tasks[NR_IOWAIT])
 		state_mask |= BIT(PSI_NONIDLE);
 
 	return state_mask;
@@ -836,14 +834,13 @@ static void psi_group_change(struct psi_group *group, int cpu,
 	for (t = 0, m = clear; m; m &= ~(1 << t), t++) {
 		if (!(m & (1 << t)))
 			continue;
-		if (groupc->tasks[t]) {
+		if (likely(groupc->tasks[t])) {
 			groupc->tasks[t]--;
-		} else if (!psi_bug) {
-			printk_deferred(KERN_ERR "psi: task underflow! cpu=%d t=%d tasks=[%u %u %u %u] clear=%x set=%x\n",
+		} else {
+			printk_deferred_once("psi: task underflow! cpu=%d t=%d tasks=[%u %u %u %u] clear=%x set=%x\n",
 					cpu, t, groupc->tasks[0],
 					groupc->tasks[1], groupc->tasks[2],
 					groupc->tasks[3], clear, set);
-			psi_bug = 1;
 		}
 	}
 
@@ -908,13 +905,11 @@ static inline struct psi_group *task_psi_group(struct task_struct *task)
 
 static void psi_flags_change(struct task_struct *task, int clear, int set)
 {
-	if (((task->psi_flags & set) ||
-	     (task->psi_flags & clear) != clear) &&
-	    !psi_bug) {
-		printk_deferred(KERN_ERR "psi: inconsistent task state! task=%d:%s cpu=%d psi_flags=%x clear=%x set=%x\n",
+	if (unlikely(((task->psi_flags & set) ||
+	     (task->psi_flags & clear) != clear))) {
+		printk_deferred_once("psi: inconsistent task state! task=%d:%s cpu=%d psi_flags=%x clear=%x set=%x\n",
 				task->pid, task->comm, task_cpu(task),
 				task->psi_flags, clear, set);
-		psi_bug = 1;
 	}
 
 	task->psi_flags &= ~clear;
@@ -927,7 +922,7 @@ void psi_task_change(struct task_struct *task, int clear, int set)
 	u64 now;
 	bool curr_in_memstall;
 
-	if (!task->need_psi)
+	if (unlikely(!task->need_psi))
 		return;
 
 	psi_flags_change(task, clear, set);
-- 
2.52.0
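
The move in patch 6/6 from a shared psi_bug flag to printk_deferred_once() can be illustrated in user space. What follows is a minimal sketch, not kernel code: report_once(), the two counters and the check_*() helpers are hypothetical names introduced only for illustration, and fprintf() stands in for the deferred printk. It assumes the usual behaviour of the kernel's *_once print helpers, where every call site keeps its own "already printed" flag.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical user-space stand-in for a *_once print helper.
 * Each expansion of the macro gets its own static flag, so each
 * call site reports at most once, independently of the others.
 */
#define report_once(counter, ...)                 \
	do {                                      \
		static bool already_done;         \
		if (!already_done) {              \
			already_done = true;      \
			(counter)++;              \
			fprintf(stderr, __VA_ARGS__); \
		}                                 \
	} while (0)

/* Counters exist only so the sketch can be checked; they are assumptions. */
static int underflow_reports;
static int inconsistent_reports;

/* Mirrors the shape of the task-count check: decrement, or report once. */
static void check_underflow(unsigned int *tasks, int t)
{
	if (tasks[t])
		tasks[t]--;
	else
		report_once(underflow_reports,
			    "sketch: task underflow! t=%d\n", t);
}

/* Mirrors the shape of the flag-consistency check in psi_flags_change(). */
static void check_flags(unsigned int flags, unsigned int clear,
			unsigned int set)
{
	if ((flags & set) || (flags & clear) != clear)
		report_once(inconsistent_reports,
			    "sketch: inconsistent state! flags=%x clear=%x set=%x\n",
			    flags, clear, set);
}
```

Under that assumption, one difference from the old code is worth noting: with a single psi_bug flag, whichever check fired first also silenced the other, whereas per-call-site flags let the underflow message and the inconsistent-state message each appear once.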