From nobody Sun Feb 8 06:08:38 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 37A2ECDB482 for ; Mon, 16 Oct 2023 05:30:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231646AbjJPFaf (ORCPT ); Mon, 16 Oct 2023 01:30:35 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38484 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231561AbjJPFaa (ORCPT ); Mon, 16 Oct 2023 01:30:30 -0400 Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.65]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 92274E0 for ; Sun, 15 Oct 2023 22:30:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1697434227; x=1728970227; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Td3CNuYVe/DvxEXYqqfRgCRBE8gOU4QwgmItjh0k3pc=; b=Kh1RC1FG0a8m+brU8tGlY3EZmzgU+pWCKmKF3VL0FRWMZ3CKtZ93cVot egoDiMIU0+06bBL799dwV1ibMF2dMp/eiSLClpVSfftyd+rYvtsCCfiA8 /ESAh3hBEaD7yEbv2K7rcSms4EJAEkOP7UI6RZuxbd7wh6TI7GWIq4j7i kcrdTcjPcXqLoXk0fBnLo1CeLBJpZVCxzTBq8UjhOtq8wWLr/edLBoFIA AcY9BfVbgaSPMHAjjgum0XO7cpgUhObAf7MnByr6R3Wek8YDwskZXtuL6 jM237bJLtayRe/7y4HkgL3Fn8yr7Q7w7OLC9GYB4GIV6WXgtZSidYdQbf A==; X-IronPort-AV: E=McAfee;i="6600,9927,10863"; a="389307964" X-IronPort-AV: E=Sophos;i="6.03,228,1694761200"; d="scan'208";a="389307964" Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Oct 2023 22:30:27 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10863"; a="899356650" X-IronPort-AV: E=Sophos;i="6.03,228,1694761200"; d="scan'208";a="899356650" Received: from yhuang6-mobl2.sh.intel.com ([10.238.6.133]) by fmsmga001-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Oct 2023 22:28:25 -0700 From: Huang Ying To: Andrew Morton Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Arjan Van De Ven , Huang Ying , Sudeep Holla , Mel Gorman , Vlastimil Babka , David Hildenbrand , Johannes Weiner , Dave Hansen , Michal Hocko , Pavel Tatashin , Matthew Wilcox , Christoph Lameter Subject: [PATCH -V3 2/9] cacheinfo: calculate size of per-CPU data cache slice Date: Mon, 16 Oct 2023 13:29:55 +0800 Message-Id: <20231016053002.756205-3-ying.huang@intel.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20231016053002.756205-1-ying.huang@intel.com> References: <20231016053002.756205-1-ying.huang@intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" This can be used to estimate the size of the data cache slice that can be used by one CPU under ideal circumstances. Both DATA caches and UNIFIED caches are used in calculation. So, the users need to consider the impact of the code cache usage. Because the cache inclusive/non-inclusive information isn't available now, we just use the size of the per-CPU slice of LLC to make the result more predictable across architectures. This may be improved when more cache information is available in the future. A brute-force algorithm to iterate all online CPUs is used to avoid to allocate an extra cpumask, especially in offline callback. Signed-off-by: "Huang, Ying" Cc: Sudeep Holla Cc: Andrew Morton Cc: Mel Gorman Cc: Vlastimil Babka Cc: David Hildenbrand Cc: Johannes Weiner Cc: Dave Hansen Cc: Michal Hocko Cc: Pavel Tatashin Cc: Matthew Wilcox Cc: Christoph Lameter Acked-by: Mel Gorman --- drivers/base/cacheinfo.c | 49 ++++++++++++++++++++++++++++++++++++++- include/linux/cacheinfo.h | 1 + 2 files changed, 49 insertions(+), 1 deletion(-) diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c index cbae8be1fe52..585c66fce9d9 100644 --- a/drivers/base/cacheinfo.c +++ b/drivers/base/cacheinfo.c @@ -898,6 +898,48 @@ static int cache_add_dev(unsigned int cpu) return rc; } =20 +/* + * Calculate the size of the per-CPU data cache slice. This can be + * used to estimate the size of the data cache slice that can be used + * by one CPU under ideal circumstances. UNIFIED caches are counted + * in addition to DATA caches. So, please consider code cache usage + * when use the result. + * + * Because the cache inclusive/non-inclusive information isn't + * available, we just use the size of the per-CPU slice of LLC to make + * the result more predictable across architectures. + */ +static void update_per_cpu_data_slice_size_cpu(unsigned int cpu) +{ + struct cpu_cacheinfo *ci; + struct cacheinfo *llc; + unsigned int nr_shared; + + if (!last_level_cache_is_valid(cpu)) + return; + + ci =3D ci_cacheinfo(cpu); + llc =3D per_cpu_cacheinfo_idx(cpu, cache_leaves(cpu) - 1); + + if (llc->type !=3D CACHE_TYPE_DATA && llc->type !=3D CACHE_TYPE_UNIFIED) + return; + + nr_shared =3D cpumask_weight(&llc->shared_cpu_map); + if (nr_shared) + ci->per_cpu_data_slice_size =3D llc->size / nr_shared; +} + +static void update_per_cpu_data_slice_size(bool cpu_online, unsigned int c= pu) +{ + unsigned int icpu; + + for_each_online_cpu(icpu) { + if (!cpu_online && icpu =3D=3D cpu) + continue; + update_per_cpu_data_slice_size_cpu(icpu); + } +} + static int cacheinfo_cpu_online(unsigned int cpu) { int rc =3D detect_cache_attributes(cpu); @@ -906,7 +948,11 @@ static int cacheinfo_cpu_online(unsigned int cpu) return rc; rc =3D cache_add_dev(cpu); if (rc) - free_cache_attributes(cpu); + goto err; + update_per_cpu_data_slice_size(true, cpu); + return 0; +err: + free_cache_attributes(cpu); return rc; } =20 @@ -916,6 +962,7 @@ static int cacheinfo_cpu_pre_down(unsigned int cpu) cpu_cache_sysfs_exit(cpu); =20 free_cache_attributes(cpu); + update_per_cpu_data_slice_size(false, cpu); return 0; } =20 diff --git a/include/linux/cacheinfo.h b/include/linux/cacheinfo.h index a5cfd44fab45..d504eb4b49ab 100644 --- a/include/linux/cacheinfo.h +++ b/include/linux/cacheinfo.h @@ -73,6 +73,7 @@ struct cacheinfo { =20 struct cpu_cacheinfo { struct cacheinfo *info_list; + unsigned int per_cpu_data_slice_size; unsigned int num_levels; unsigned int num_leaves; bool cpu_map_populated; --=20 2.39.2