From nobody Thu Dec 18 05:58:56 2025
From: Pasha Tatashin
To: akpm@linux-foundation.org, jpoimboe@kernel.org,
 pasha.tatashin@soleen.com, kent.overstreet@linux.dev,
 peterz@infradead.org, nphamcs@gmail.com, cerasuolodomenico@gmail.com,
 surenb@google.com, lizhijian@fujitsu.com, willy@infradead.org,
 shakeel.butt@linux.dev, vbabka@suse.cz, ziy@nvidia.com,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org, yosryahmed@google.com
Subject: [PATCH v6 1/3] memcg: increase the valid index range for memcg stats
Date: Tue, 30 Jul 2024 15:01:56 +0000
Message-ID: <20240730150158.832783-2-pasha.tatashin@soleen.com>
In-Reply-To: <20240730150158.832783-1-pasha.tatashin@soleen.com>
References: <20240730150158.832783-1-pasha.tatashin@soleen.com>

From: Shakeel Butt

At the moment the valid index for the indirection tables for memcg stats
and events is < S8_MAX. These indirection tables are used in
performance-critical codepaths. With the latest addition to vm_events,
NR_VM_EVENT_ITEMS has gone over S8_MAX. One way to resolve this is to
increase the entry size of the indirection table from int8_t to int16_t,
but this would increase the potential number of cachelines needed to
access the indirection table.

This patch takes a different approach and makes the valid index
< U8_MAX. This way the size of the indirection tables remains the same,
and we only need to change the invalid-index check from "less than 0" to
"equal to U8_MAX". This approach also removes a subtraction from the
performance-critical codepaths.

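As an illustration only, the sentinel scheme described above can be sketched in user-space C. This is not the kernel code: NR_ITEMS, INVALID_IDX, BAD_IDX, index_table, init_index_table() and lookup_index() are hypothetical names chosen for the sketch, and uint8_t/UINT8_MAX stand in for the kernel's u8/U8_MAX.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical item space: larger than S8_MAX (127), smaller than U8_MAX (255). */
enum { NR_ITEMS = 200 };

#define INVALID_IDX UINT8_MAX	/* sentinel, plays the role of U8_MAX */
#define BAD_IDX(i) ((uint32_t)(i) >= INVALID_IDX)

/* One byte per item, so the table size is the same as with int8_t entries. */
static uint8_t index_table[NR_ITEMS];

/* Give each tracked item a dense slot; every other item keeps the sentinel. */
static void init_index_table(const unsigned int *tracked, size_t n_tracked)
{
	size_t i;

	assert(n_tracked < INVALID_IDX);	/* dense slots must stay below the sentinel */
	memset(index_table, INVALID_IDX, sizeof(index_table));
	for (i = 0; i < n_tracked; i++)
		index_table[tracked[i]] = (uint8_t)i;
}

/* The lookup is a plain byte load with no "- 1" adjustment. */
static int lookup_index(unsigned int item)
{
	return index_table[item];
}
```

A caller would test the result with BAD_IDX() before using it as an array offset, in the same spirit as the WARN_ONCE() checks in the diff.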
Signed-off-by: Shakeel Butt
Co-developed-by: Pasha Tatashin
Signed-off-by: Pasha Tatashin
Reviewed-by: Yosry Ahmed
---
 mm/memcontrol.c | 50 +++++++++++++++++++++++++++----------------------
 1 file changed, 28 insertions(+), 22 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 960371788687..84f383952d32 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -320,24 +320,27 @@ static const unsigned int memcg_stat_items[] = {
 #define NR_MEMCG_NODE_STAT_ITEMS ARRAY_SIZE(memcg_node_stat_items)
 #define MEMCG_VMSTAT_SIZE (NR_MEMCG_NODE_STAT_ITEMS + \
 			   ARRAY_SIZE(memcg_stat_items))
-static int8_t mem_cgroup_stats_index[MEMCG_NR_STAT] __read_mostly;
+#define BAD_STAT_IDX(index) ((u32)(index) >= U8_MAX)
+static u8 mem_cgroup_stats_index[MEMCG_NR_STAT] __read_mostly;
 
 static void init_memcg_stats(void)
 {
-	int8_t i, j = 0;
+	u8 i, j = 0;
 
-	BUILD_BUG_ON(MEMCG_NR_STAT >= S8_MAX);
+	BUILD_BUG_ON(MEMCG_NR_STAT >= U8_MAX);
 
-	for (i = 0; i < NR_MEMCG_NODE_STAT_ITEMS; ++i)
-		mem_cgroup_stats_index[memcg_node_stat_items[i]] = ++j;
+	memset(mem_cgroup_stats_index, U8_MAX, sizeof(mem_cgroup_stats_index));
 
-	for (i = 0; i < ARRAY_SIZE(memcg_stat_items); ++i)
-		mem_cgroup_stats_index[memcg_stat_items[i]] = ++j;
+	for (i = 0; i < NR_MEMCG_NODE_STAT_ITEMS; ++i, ++j)
+		mem_cgroup_stats_index[memcg_node_stat_items[i]] = j;
+
+	for (i = 0; i < ARRAY_SIZE(memcg_stat_items); ++i, ++j)
+		mem_cgroup_stats_index[memcg_stat_items[i]] = j;
 }
 
 static inline int memcg_stats_index(int idx)
 {
-	return mem_cgroup_stats_index[idx] - 1;
+	return mem_cgroup_stats_index[idx];
 }
 
 struct lruvec_stats_percpu {
@@ -369,7 +372,7 @@ unsigned long lruvec_page_state(struct lruvec *lruvec, enum node_stat_item idx)
 		return node_page_state(lruvec_pgdat(lruvec), idx);
 
 	i = memcg_stats_index(idx);
-	if (WARN_ONCE(i < 0, "%s: missing stat item %d\n", __func__, idx))
+	if (WARN_ONCE(BAD_STAT_IDX(i), "%s: missing stat item %d\n", __func__, idx))
 		return 0;
 
 	pn = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
@@ -392,7 +395,7 @@ unsigned long lruvec_page_state_local(struct lruvec *lruvec,
 		return node_page_state(lruvec_pgdat(lruvec), idx);
 
 	i = memcg_stats_index(idx);
-	if (WARN_ONCE(i < 0, "%s: missing stat item %d\n", __func__, idx))
+	if (WARN_ONCE(BAD_STAT_IDX(i), "%s: missing stat item %d\n", __func__, idx))
 		return 0;
 
 	pn = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
@@ -435,21 +438,24 @@ static const unsigned int memcg_vm_event_stat[] = {
 };
 
 #define NR_MEMCG_EVENTS ARRAY_SIZE(memcg_vm_event_stat)
-static int8_t mem_cgroup_events_index[NR_VM_EVENT_ITEMS] __read_mostly;
+static u8 mem_cgroup_events_index[NR_VM_EVENT_ITEMS] __read_mostly;
 
 static void init_memcg_events(void)
 {
-	int8_t i;
+	u8 i;
+
+	BUILD_BUG_ON(NR_VM_EVENT_ITEMS >= U8_MAX);
 
-	BUILD_BUG_ON(NR_VM_EVENT_ITEMS >= S8_MAX);
+	memset(mem_cgroup_events_index, U8_MAX,
+	       sizeof(mem_cgroup_events_index));
 
 	for (i = 0; i < NR_MEMCG_EVENTS; ++i)
-		mem_cgroup_events_index[memcg_vm_event_stat[i]] = i + 1;
+		mem_cgroup_events_index[memcg_vm_event_stat[i]] = i;
 }
 
 static inline int memcg_events_index(enum vm_event_item idx)
 {
-	return mem_cgroup_events_index[idx] - 1;
+	return mem_cgroup_events_index[idx];
 }
 
 struct memcg_vmstats_percpu {
@@ -621,7 +627,7 @@ unsigned long memcg_page_state(struct mem_cgroup *memcg, int idx)
 	long x;
 	int i = memcg_stats_index(idx);
 
-	if (WARN_ONCE(i < 0, "%s: missing stat item %d\n", __func__, idx))
+	if (WARN_ONCE(BAD_STAT_IDX(i), "%s: missing stat item %d\n", __func__, idx))
 		return 0;
 
 	x = READ_ONCE(memcg->vmstats->state[i]);
@@ -662,7 +668,7 @@ void __mod_memcg_state(struct mem_cgroup *memcg, enum memcg_stat_item idx,
 	if (mem_cgroup_disabled())
 		return;
 
-	if (WARN_ONCE(i < 0, "%s: missing stat item %d\n", __func__, idx))
+	if (WARN_ONCE(BAD_STAT_IDX(i), "%s: missing stat item %d\n", __func__, idx))
 		return;
 
 	__this_cpu_add(memcg->vmstats_percpu->state[i], val);
@@ -675,7 +681,7 @@ unsigned long memcg_page_state_local(struct mem_cgroup *memcg, int idx)
 	long x;
 	int i = memcg_stats_index(idx);
 
-	if (WARN_ONCE(i < 0, "%s: missing stat item %d\n", __func__, idx))
+	if (WARN_ONCE(BAD_STAT_IDX(i), "%s: missing stat item %d\n", __func__, idx))
 		return 0;
 
 	x = READ_ONCE(memcg->vmstats->state_local[i]);
@@ -694,7 +700,7 @@ static void __mod_memcg_lruvec_state(struct lruvec *lruvec,
 	struct mem_cgroup *memcg;
 	int i = memcg_stats_index(idx);
 
-	if (WARN_ONCE(i < 0, "%s: missing stat item %d\n", __func__, idx))
+	if (WARN_ONCE(BAD_STAT_IDX(i), "%s: missing stat item %d\n", __func__, idx))
 		return;
 
 	pn = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
@@ -810,7 +816,7 @@ void __count_memcg_events(struct mem_cgroup *memcg, enum vm_event_item idx,
 	if (mem_cgroup_disabled())
 		return;
 
-	if (WARN_ONCE(i < 0, "%s: missing stat item %d\n", __func__, idx))
+	if (WARN_ONCE(BAD_STAT_IDX(i), "%s: missing stat item %d\n", __func__, idx))
 		return;
 
 	memcg_stats_lock();
@@ -823,7 +829,7 @@ unsigned long memcg_events(struct mem_cgroup *memcg, int event)
 {
 	int i = memcg_events_index(event);
 
-	if (WARN_ONCE(i < 0, "%s: missing stat item %d\n", __func__, event))
+	if (WARN_ONCE(BAD_STAT_IDX(i), "%s: missing stat item %d\n", __func__, event))
 		return 0;
 
 	return READ_ONCE(memcg->vmstats->events[i]);
@@ -833,7 +839,7 @@ unsigned long memcg_events_local(struct mem_cgroup *memcg, int event)
 {
 	int i = memcg_events_index(event);
 
-	if (WARN_ONCE(i < 0, "%s: missing stat item %d\n", __func__, event))
+	if (WARN_ONCE(BAD_STAT_IDX(i), "%s: missing stat item %d\n", __func__, event))
 		return 0;
 
 	return READ_ONCE(memcg->vmstats->events_local[i]);
-- 
2.46.0.rc1.232.g9752f9e123-goog

From nobody Thu Dec 18 05:58:56 2025
From: Pasha Tatashin
To: akpm@linux-foundation.org, jpoimboe@kernel.org,
 pasha.tatashin@soleen.com, kent.overstreet@linux.dev,
 peterz@infradead.org, nphamcs@gmail.com, cerasuolodomenico@gmail.com,
 surenb@google.com, lizhijian@fujitsu.com, willy@infradead.org,
 shakeel.butt@linux.dev, vbabka@suse.cz, ziy@nvidia.com,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org, yosryahmed@google.com
Subject: [PATCH v6 2/3] vmstat: Kernel stack usage histogram
Date: Tue, 30 Jul 2024 15:01:57 +0000
Message-ID: <20240730150158.832783-3-pasha.tatashin@soleen.com>
In-Reply-To: <20240730150158.832783-1-pasha.tatashin@soleen.com>
References: <20240730150158.832783-1-pasha.tatashin@soleen.com>

As part of the dynamic kernel stack project, we need to know the amount
of memory that can be saved by reducing the default kernel stack size
[1].

Provide a kernel stack usage histogram to aid in optimizing kernel stack
sizes and minimizing memory waste in large-scale environments. The
histogram divides stack usage into power-of-two buckets and reports the
results in /proc/vmstat. This information is especially valuable in
environments with millions of machines, where even small optimizations
can have a significant impact.

The histogram data is presented in /proc/vmstat with entries like
"kstack_1k", "kstack_2k", and so on, indicating the number of threads
that exited with stack usage falling within each respective bucket.

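The power-of-two bucketing described above can be sketched in plain C as follows. This is a simplified user-space illustration, not the patch's implementation (which uses a preprocessor-guarded if/else chain and count_vm_event()): NR_BUCKETS, kstack_buckets, bucket_for() and record_exit() are hypothetical names, and the fixed bucket count assumes a 16K stack plus a catch-all bucket.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: 1K, 2K, 4K, 8K, 16K, plus a catch-all "rest" bucket. */
enum { NR_BUCKETS = 6 };

static unsigned long kstack_buckets[NR_BUCKETS];

/*
 * Return the index of the smallest power-of-two bucket, starting at
 * 1 KiB, whose upper limit is at least used_stack bytes. Anything larger
 * than the last sized bucket lands in the final catch-all slot.
 */
static size_t bucket_for(unsigned long used_stack)
{
	unsigned long limit = 1024;
	size_t b = 0;

	while (b < NR_BUCKETS - 1 && used_stack > limit) {
		limit <<= 1;
		b++;
	}
	return b;
}

/* Account one exiting thread, as the histogram does at thread exit. */
static void record_exit(unsigned long used_stack)
{
	size_t b = bucket_for(used_stack);

	assert(b < NR_BUCKETS);
	kstack_buckets[b]++;
}
```

With this layout a thread that used 3000 bytes is counted in the 4K bucket, which corresponds to the "kstack_4k" entry in the output format described above.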
Example outputs:

Intel:
$ grep kstack /proc/vmstat
kstack_1k 3
kstack_2k 188
kstack_4k 11391
kstack_8k 243
kstack_16k 0

ARM with 64K page_size:
$ grep kstack /proc/vmstat
kstack_1k 1
kstack_2k 340
kstack_4k 25212
kstack_8k 1659
kstack_16k 0
kstack_32k 0
kstack_64k 0

Note: once the dynamic kernel stack is implemented, the usability of
this feature will depend on the implementation. On hardware that
supports faults on kernel stacks, we will have other metrics that show
the total number of pages allocated for stacks. On hardware where faults
are not supported, we will most likely have some optimization where only
some threads are extended, and for those, these metrics will still be
very useful.

[1] https://lwn.net/Articles/974367

Signed-off-by: Pasha Tatashin
Reviewed-by: Kent Overstreet
Acked-by: Shakeel Butt
---
 include/linux/vm_event_item.h | 24 ++++++++++++++++++++++
 kernel/exit.c                 | 38 +++++++++++++++++++++++++++++++++++
 mm/vmstat.c                   | 24 ++++++++++++++++++++++
 3 files changed, 86 insertions(+)

diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h
index 747943bc8cc2..37ad1c16367a 100644
--- a/include/linux/vm_event_item.h
+++ b/include/linux/vm_event_item.h
@@ -154,6 +154,30 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
 		VMA_LOCK_RETRY,
 		VMA_LOCK_MISS,
 #endif
+#ifdef CONFIG_DEBUG_STACK_USAGE
+		KSTACK_1K,
+#if THREAD_SIZE > 1024
+		KSTACK_2K,
+#endif
+#if THREAD_SIZE > 2048
+		KSTACK_4K,
+#endif
+#if THREAD_SIZE > 4096
+		KSTACK_8K,
+#endif
+#if THREAD_SIZE > 8192
+		KSTACK_16K,
+#endif
+#if THREAD_SIZE > 16384
+		KSTACK_32K,
+#endif
+#if THREAD_SIZE > 32768
+		KSTACK_64K,
+#endif
+#if THREAD_SIZE > 65536
+		KSTACK_REST,
+#endif
+#endif /* CONFIG_DEBUG_STACK_USAGE */
 		NR_VM_EVENT_ITEMS
 };
 
diff --git a/kernel/exit.c b/kernel/exit.c
index 7430852a8571..64bfc2bae55b 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -778,6 +778,43 @@ static void exit_notify(struct task_struct *tsk, int group_dead)
 }
 
 #ifdef CONFIG_DEBUG_STACK_USAGE
+/* Count the maximum pages reached in kernel stacks */
+static inline void kstack_histogram(unsigned long used_stack)
+{
+#ifdef CONFIG_VM_EVENT_COUNTERS
+	if (used_stack <= 1024)
+		count_vm_event(KSTACK_1K);
+#if THREAD_SIZE > 1024
+	else if (used_stack <= 2048)
+		count_vm_event(KSTACK_2K);
+#endif
+#if THREAD_SIZE > 2048
+	else if (used_stack <= 4096)
+		count_vm_event(KSTACK_4K);
+#endif
+#if THREAD_SIZE > 4096
+	else if (used_stack <= 8192)
+		count_vm_event(KSTACK_8K);
+#endif
+#if THREAD_SIZE > 8192
+	else if (used_stack <= 16384)
+		count_vm_event(KSTACK_16K);
+#endif
+#if THREAD_SIZE > 16384
+	else if (used_stack <= 32768)
+		count_vm_event(KSTACK_32K);
+#endif
+#if THREAD_SIZE > 32768
+	else if (used_stack <= 65536)
+		count_vm_event(KSTACK_64K);
+#endif
+#if THREAD_SIZE > 65536
+	else
+		count_vm_event(KSTACK_REST);
+#endif
+#endif /* CONFIG_VM_EVENT_COUNTERS */
+}
+
 static void check_stack_usage(void)
 {
 	static DEFINE_SPINLOCK(low_water_lock);
@@ -785,6 +822,7 @@ static void check_stack_usage(void)
 	unsigned long free;
 
 	free = stack_not_used(current);
+	kstack_histogram(THREAD_SIZE - free);
 
 	if (free >= lowest_to_date)
 		return;
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 04a1cb6cc636..c7d52a9660c3 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1417,6 +1417,30 @@ const char * const vmstat_text[] = {
 	"vma_lock_retry",
 	"vma_lock_miss",
 #endif
+#ifdef CONFIG_DEBUG_STACK_USAGE
+	"kstack_1k",
+#if THREAD_SIZE > 1024
+	"kstack_2k",
+#endif
+#if THREAD_SIZE > 2048
+	"kstack_4k",
+#endif
+#if THREAD_SIZE > 4096
+	"kstack_8k",
+#endif
+#if THREAD_SIZE > 8192
+	"kstack_16k",
+#endif
+#if THREAD_SIZE > 16384
+	"kstack_32k",
+#endif
+#if THREAD_SIZE > 32768
+	"kstack_64k",
+#endif
+#if THREAD_SIZE > 65536
+	"kstack_rest",
+#endif
+#endif
 #endif /* CONFIG_VM_EVENT_COUNTERS || CONFIG_MEMCG */
 };
 #endif /* CONFIG_PROC_FS || CONFIG_SYSFS || CONFIG_NUMA || CONFIG_MEMCG */
-- 
2.46.0.rc1.232.g9752f9e123-goog

From nobody Thu Dec 18 05:58:56 2025
From: Pasha Tatashin
To: akpm@linux-foundation.org, jpoimboe@kernel.org,
 pasha.tatashin@soleen.com, kent.overstreet@linux.dev,
 peterz@infradead.org, nphamcs@gmail.com, cerasuolodomenico@gmail.com,
 surenb@google.com, lizhijian@fujitsu.com, willy@infradead.org,
 shakeel.butt@linux.dev, vbabka@suse.cz, ziy@nvidia.com,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org, yosryahmed@google.com
Subject: [PATCH v6 3/3] task_stack: uninline stack_not_used
Date: Tue, 30 Jul 2024 15:01:58 +0000
Message-ID: <20240730150158.832783-4-pasha.tatashin@soleen.com>
In-Reply-To: <20240730150158.832783-1-pasha.tatashin@soleen.com>
References: <20240730150158.832783-1-pasha.tatashin@soleen.com>

Given that stack_not_used() is not a performance-critical function,
uninline it.

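For readers unfamiliar with stack_not_used(), the scan it performs can be sketched in user-space C for the downward-growing case. This is an illustration under stated assumptions, not the kernel function: unused_bytes() is a hypothetical name, a plain array stands in for the task stack with words[0] as the end-of-stack canary, and the bounds check is an addition that the kernel version does not have.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a downward-growing stack: words[0] holds the
 * end-of-stack canary and every word the thread never touched is still
 * zero. Return the number of bytes from the stack end up to the first
 * word that was written.
 */
static size_t unused_bytes(const unsigned long *words, size_t nwords)
{
	size_t i = 1;	/* skip over the canary, like the do/while loop */

	assert(nwords >= 2);
	while (i < nwords && words[i] == 0)
		i++;
	return i * sizeof(unsigned long);
}
```

The linear scan over zeroed words is why the function is only used on slow paths such as thread exit and task dumps.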
Signed-off-by: Pasha Tatashin
Acked-by: Shakeel Butt
---
 include/linux/sched/task_stack.h | 18 +++---------------
 kernel/exit.c                    | 19 +++++++++++++++++++
 kernel/sched/core.c              |  4 +---
 3 files changed, 23 insertions(+), 18 deletions(-)

diff --git a/include/linux/sched/task_stack.h b/include/linux/sched/task_stack.h
index ccd72b978e1f..bf10bdb487dd 100644
--- a/include/linux/sched/task_stack.h
+++ b/include/linux/sched/task_stack.h
@@ -95,23 +95,11 @@ static inline int object_is_on_stack(const void *obj)
 extern void thread_stack_cache_init(void);
 
 #ifdef CONFIG_DEBUG_STACK_USAGE
+unsigned long stack_not_used(struct task_struct *p);
+#else
 static inline unsigned long stack_not_used(struct task_struct *p)
 {
-	unsigned long *n = end_of_stack(p);
-
-	do {	/* Skip over canary */
-# ifdef CONFIG_STACK_GROWSUP
-		n--;
-# else
-		n++;
-# endif
-	} while (!*n);
-
-# ifdef CONFIG_STACK_GROWSUP
-	return (unsigned long)end_of_stack(p) - (unsigned long)n;
-# else
-	return (unsigned long)n - (unsigned long)end_of_stack(p);
-# endif
+	return 0;
 }
 #endif
 extern void set_task_stack_end_magic(struct task_struct *tsk);
diff --git a/kernel/exit.c b/kernel/exit.c
index 64bfc2bae55b..45085a0e7c16 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -778,6 +778,25 @@ static void exit_notify(struct task_struct *tsk, int group_dead)
 }
 
 #ifdef CONFIG_DEBUG_STACK_USAGE
+unsigned long stack_not_used(struct task_struct *p)
+{
+	unsigned long *n = end_of_stack(p);
+
+	do {	/* Skip over canary */
+# ifdef CONFIG_STACK_GROWSUP
+		n--;
+# else
+		n++;
+# endif
+	} while (!*n);
+
+# ifdef CONFIG_STACK_GROWSUP
+	return (unsigned long)end_of_stack(p) - (unsigned long)n;
+# else
+	return (unsigned long)n - (unsigned long)end_of_stack(p);
+# endif
+}
+
 /* Count the maximum pages reached in kernel stacks */
 static inline void kstack_histogram(unsigned long used_stack)
 {
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index a9f655025607..b275f4f27e3c 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7405,7 +7405,7 @@ EXPORT_SYMBOL(io_schedule);
 
 void sched_show_task(struct task_struct *p)
 {
-	unsigned long free = 0;
+	unsigned long free;
 	int ppid;
 
 	if (!try_get_task_stack(p))
@@ -7415,9 +7415,7 @@ void sched_show_task(struct task_struct *p)
 
 	if (task_is_running(p))
 		pr_cont(" running task ");
-#ifdef CONFIG_DEBUG_STACK_USAGE
 	free = stack_not_used(p);
-#endif
 	ppid = 0;
 	rcu_read_lock();
 	if (pid_alive(p))
-- 
2.46.0.rc1.232.g9752f9e123-goog