From: Chengming Zhou
To: corbet@lwn.net, hannes@cmpxchg.org, surenb@google.com, mingo@redhat.com,
    peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org,
    dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com,
    mgorman@suse.de, bristot@redhat.com, ebiggers@google.com,
    zhouchengming@bytedance.com, songmuchun@bytedance.com
Cc: linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
    duanxiongchun@bytedance.com, Martin Steigerwald
Subject: [PATCH RESEND v2] sched/psi: report zeroes for CPU full at the system level
Date: Fri, 8 Apr 2022 20:19:14 +0800
Message-Id: <20220408121914.82855-1-zhouchengming@bytedance.com>
X-Mailer: git-send-email 2.35.1

Martin found the /proc/pressure/cpu output confusing, and found no hint
about the CPU "full" line in the psi documentation.

	% cat /proc/pressure/cpu
	some avg10=0.92 avg60=0.91 avg300=0.73 total=933490489
	full avg10=0.22 avg60=0.23 avg300=0.16 total=358783277

The PSI_CPU_FULL state was introduced by commit e7fcd7622823 ("psi: Add
PSI_CPU_FULL state"), mainly for the cgroup level, but it is also
counted at the system level as a side effect.

Naturally, the FULL state doesn't exist for the CPU resource at the
system level. These "full" numbers can come from CPU idle schedule
latency. For example, t1 is the time when a task wakes up on an idle
CPU, and t2 is the time when the CPU picks it and switches to it. The
delta (t2 - t1) is accounted as CPU_FULL state.

Another case in which all processes can be stalled is when all cgroups
are throttled at the same time, which is unlikely to happen.

Either way, the CPU_FULL metric is meaningless and confusing at the
system level. So report zeroes for CPU full at the system level, and
update the psi documentation accordingly.

Fixes: e7fcd7622823 ("psi: Add PSI_CPU_FULL state")
Reported-by: Martin Steigerwald
Suggested-by: Johannes Weiner
Acked-by: Johannes Weiner
Signed-off-by: Chengming Zhou
---
v2:
 - report zeroes for CPU full at the system level, suggested by Johannes.
 - update doc about the zeroes in CPU full at the system level.
---
 Documentation/accounting/psi.rst |  9 ++++-----
 kernel/sched/psi.c               | 15 +++++++++------
 2 files changed, 13 insertions(+), 11 deletions(-)

diff --git a/Documentation/accounting/psi.rst b/Documentation/accounting/psi.rst
index 860fe651d645..5e40b3f437f9 100644
--- a/Documentation/accounting/psi.rst
+++ b/Documentation/accounting/psi.rst
@@ -37,11 +37,7 @@ Pressure interface
 Pressure information for each resource is exported through the
 respective file in /proc/pressure/ -- cpu, memory, and io.
 
-The format for CPU is as such::
-
-	some avg10=0.00 avg60=0.00 avg300=0.00 total=0
-
-and for memory and IO::
+The format is as such::
 
 	some avg10=0.00 avg60=0.00 avg300=0.00 total=0
 	full avg10=0.00 avg60=0.00 avg300=0.00 total=0
@@ -58,6 +54,9 @@ situation from a state where some tasks are stalled but the CPU is
 still doing productive work. As such, time spent in this subset of the
 stall state is tracked separately and exported in the "full" averages.
 
+CPU full is undefined at the system level, but has been reported
+since 5.13, so it is set to zero for backward compatibility.
+
 The ratios (in %) are tracked as recent trends over ten, sixty, and
 three hundred second windows, which gives insight into short term events
 as well as medium and long term trends. The total absolute stall time
diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index a4fa3aadfcba..ed9fb557dadd 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -1060,14 +1060,17 @@ int psi_show(struct seq_file *m, struct psi_group *group, enum psi_res res)
 	mutex_unlock(&group->avgs_lock);
 
 	for (full = 0; full < 2; full++) {
-		unsigned long avg[3];
-		u64 total;
+		unsigned long avg[3] = { 0, };
+		u64 total = 0;
 		int w;
 
-		for (w = 0; w < 3; w++)
-			avg[w] = group->avg[res * 2 + full][w];
-		total = div_u64(group->total[PSI_AVGS][res * 2 + full],
-				NSEC_PER_USEC);
+		/* CPU FULL is undefined at the system level */
+		if (!(group == &psi_system && res == PSI_CPU && full)) {
+			for (w = 0; w < 3; w++)
+				avg[w] = group->avg[res * 2 + full][w];
+			total = div_u64(group->total[PSI_AVGS][res * 2 + full],
+					NSEC_PER_USEC);
+		}
 
 		seq_printf(m, "%s avg10=%lu.%02lu avg60=%lu.%02lu avg300=%lu.%02lu total=%llu\n",
 			   full ? "full" : "some",
-- 
2.35.1
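
For readers who want to try the gating logic outside the kernel, here is
a rough user-space sketch of what the psi_show() change does. It is not
kernel code: struct group_model, its is_system flag, enum res and
format_line() are hypothetical names made up for this illustration, and
the averages are stored as plain hundredths of a percent rather than in
the kernel's fixed-point representation.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Resource order chosen for this sketch only. */
enum res { RES_IO, RES_MEM, RES_CPU, RES_COUNT };

/* Hypothetical user-space stand-in for the kernel's struct psi_group. */
struct group_model {
	bool is_system;				/* models group == &psi_system */
	unsigned int avg_centi[RES_COUNT * 2][3]; /* averages, 1/100 percent */
	uint64_t total_ns[RES_COUNT * 2];	/* accumulated stall time, ns */
};

/* Format one "some"/"full" line into buf; returns snprintf()'s result. */
static int format_line(const struct group_model *g, enum res res, int full,
		       char *buf, size_t len)
{
	unsigned int avg[3] = { 0, };
	uint64_t total = 0;
	int w;

	/* CPU full is undefined at the system level: keep the zeroes. */
	if (!(g->is_system && res == RES_CPU && full)) {
		for (w = 0; w < 3; w++)
			avg[w] = g->avg_centi[res * 2 + full][w];
		total = g->total_ns[res * 2 + full] / 1000; /* ns -> us */
	}

	return snprintf(buf, len,
			"%s avg10=%u.%02u avg60=%u.%02u avg300=%u.%02u total=%" PRIu64,
			full ? "full" : "some",
			avg[0] / 100, avg[0] % 100,
			avg[1] / 100, avg[1] % 100,
			avg[2] / 100, avg[2] % 100,
			total);
}
```

Calling format_line() with is_system set, RES_CPU and full = 1 yields an
all-zero "full" line no matter what has been accumulated. The same data
in a non-system group, or any "some" line, is printed unchanged, which
matches the intent of keeping the line for backward compatibility while
zeroing its values.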