From nobody Tue Dec 23 22:06:37 2025 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.115]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1E2B77D410; Tue, 30 Jan 2024 22:20:44 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.55.52.115 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706653246; cv=none; b=QtCeldqiBxdCRsEK29Id1ZOTDfG7RdBzlfZRGQCElqZRgsCh7VfbKURW+dVfpgu4IQvJmU89nHnCUGObxWUBebnMQIvsmGT9UhMwKSNkOinSCci0NmD3SBTqJGxkcl8qe4UtBXMpXC6QOWlhTkJXVRzrvduj03PbwdX0YcFzY24= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706653246; c=relaxed/simple; bh=At0RKTLoBHv44HG3Uy9zivC4vUPHbbq8Oy87lxfsgWQ=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=NvE7Xph6PYaRX0N+K/FY9Rq1rZRYMW1OJE7jrAy9Mqy0CgnYopP3dxr/lJ8IHVbU2XS7D2DrKLS3G6Ky1fdR/ivVi7+keSgvyESPLOp4Ic0qfQupyF1Z0ONJaB64kES/AZtUrmIE+mMNbMu+aqyK8+UVgCISmk+ISFAAtve+SxQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=ejlFTTY7; arc=none smtp.client-ip=192.55.52.115 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="ejlFTTY7" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1706653245; x=1738189245; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=At0RKTLoBHv44HG3Uy9zivC4vUPHbbq8Oy87lxfsgWQ=; 
b=ejlFTTY7YqCsg8cB+Cqd3xY4NnYlnfsLlT6LdrI7U6tvFIlg8HoJ4Vu/ Ayz3lZeTXD7pCaUL9mDIw8ehB41yytUgFjG5NcKUMi3+hmt7MH4s1ENoj bWk8ly4/vtX4dn04n8sqOH9dmqlAmayhqr7UpJ3rvIHuI3Wbuqzx4fok8 ca0d+Wq31N0AlIfsy1wzkyywB4HWfVCpT8g0bdgmFwPqm6NXQCBXoVN4y xwpvN2midCK4a5OJQRHH4+yCf7yX8aGQ3tCARM1vPvgnltLGoAUDQ5Nvs 5lZJzw0Ho7lvWON3ZY16yecVtv+4IhyF4iKc+pV5QB9EWd5VwGWPybf6U Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10969"; a="403041708" X-IronPort-AV: E=Sophos;i="6.05,230,1701158400"; d="scan'208";a="403041708" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Jan 2024 14:20:42 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10969"; a="1119412838" X-IronPort-AV: E=Sophos;i="6.05,230,1701158400"; d="scan'208";a="1119412838" Received: from agluck-desk3.sc.intel.com ([172.25.222.74]) by fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Jan 2024 14:20:42 -0800 From: Tony Luck To: Fenghua Yu , Reinette Chatre , Peter Newman , Jonathan Corbet , Shuah Khan , x86@kernel.org Cc: Shaopeng Tan , James Morse , Jamie Iles , Babu Moger , Randy Dunlap , Drew Fustini , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, patches@lists.linux.dev, Tony Luck Subject: [PATCH v15-RFC 1/8] x86/resctrl: Split the RDT_RESOURCE_L3 resource Date: Tue, 30 Jan 2024 14:20:27 -0800 Message-ID: <20240130222034.37181-2-tony.luck@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240130222034.37181-1-tony.luck@intel.com> References: <20240126223837.21835-1-tony.luck@intel.com> <20240130222034.37181-1-tony.luck@intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The RDT_RESOURCE_L3 is unique in that it is used for both monitoring and control functions. This made sense while both uses had the same scope. 
But systems with Sub-NUMA clustering enabled do not follow this pattern. Create a new resource: RDT_RESOURCE_L3_MON ready to take over the monitoring functions. Signed-off-by: Tony Luck --- arch/x86/kernel/cpu/resctrl/internal.h | 1 + arch/x86/kernel/cpu/resctrl/core.c | 10 ++++++++++ 2 files changed, 11 insertions(+) diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/r= esctrl/internal.h index 52e7e7deee10..c6051bc70e96 100644 --- a/arch/x86/kernel/cpu/resctrl/internal.h +++ b/arch/x86/kernel/cpu/resctrl/internal.h @@ -429,6 +429,7 @@ DECLARE_STATIC_KEY_FALSE(rdt_alloc_enable_key); extern struct dentry *debugfs_resctrl; =20 enum resctrl_res_level { + RDT_RESOURCE_L3_MON, RDT_RESOURCE_L3, RDT_RESOURCE_L2, RDT_RESOURCE_MBA, diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resct= rl/core.c index aa9810a64258..c50f55d7790e 100644 --- a/arch/x86/kernel/cpu/resctrl/core.c +++ b/arch/x86/kernel/cpu/resctrl/core.c @@ -60,6 +60,16 @@ mba_wrmsr_amd(struct rdt_domain *d, struct msr_param *m, #define domain_init(id) LIST_HEAD_INIT(rdt_resources_all[id].r_resctrl.dom= ains) =20 struct rdt_hw_resource rdt_resources_all[] =3D { + [RDT_RESOURCE_L3_MON] =3D + { + .r_resctrl =3D { + .rid =3D RDT_RESOURCE_L3_MON, + .name =3D "L3", + .cache_level =3D 3, + .domains =3D domain_init(RDT_RESOURCE_L3_MON), + .fflags =3D RFTYPE_RES_CACHE, + }, + }, [RDT_RESOURCE_L3] =3D { .r_resctrl =3D { --=20 2.43.0 From nobody Tue Dec 23 22:06:37 2025 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.115]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EDFA37D41D; Tue, 30 Jan 2024 22:20:45 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.55.52.115 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706653247; cv=none; 
b=fYVB5uvxmOnbhJdIjnJ0zkEpCXabcF8W1cP586VKDKcf580PHZ9hCUb6OBwahZo9gpH98OiukJzjnmHMRvpTFZXCKeuCUIQKovo0TkOq632FSfdxJqlCn8On7y8JJ7vHnkBoQVis3xY++YlyuhBnyNantS0LzhK8psjWyOafIOE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706653247; c=relaxed/simple; bh=1TyiQiWxXMk7SQhe5/husjTz7Y/m94kkQYw6BLrbMtY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=uMCOMpA+rWH25U4N9t6kP0mDCmalemxz2YyE80Ee3j9AstXgDq9TOyKzus0eMyYb3xmILuT5XvIEHNAIsF6iQXHNmJyx30xjJPq+6G1pgPauBFI5vtqt76Wjk/rJFm8JRPmQ6DzxqMzqs/txugzGcQulXOLUhvE2ncRBHaNynSo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=m1scApdF; arc=none smtp.client-ip=192.55.52.115 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="m1scApdF" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1706653246; x=1738189246; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=1TyiQiWxXMk7SQhe5/husjTz7Y/m94kkQYw6BLrbMtY=; b=m1scApdFzPtMOIUViesIMw3FbP9TVb0SWjgqNX+xA6a328eNdDK68Keu 9MEmID5eBcCzy7uNYnxksaLTRfDBTquMrFwGdAKFgMrNTDW4eEnOOU15N ERiWcHNeIX1q1IJpFETXbI4CdJY0qfzHYN5ck59YsY7zAii0Vf8gHh9S3 EBvqwz8unhhnVHsrd0G7vG02O7PuxPxwuaxOWebf55s41NgcxoU5A5ZDJ XRS4sTLtqy4nKfLoVkm/Tj9Mikh1QuLqIH/fLDSrRzibnrWFR0l8cojn1 zEL9wWdZUYVBCDG1jJf1zwHAWaIl6owghhIqCBAIhWwoCsBj3zLqqqju4 Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10969"; a="403041731" X-IronPort-AV: E=Sophos;i="6.05,230,1701158400"; d="scan'208";a="403041731" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by 
fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Jan 2024 14:20:42 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10969"; a="1119412841" X-IronPort-AV: E=Sophos;i="6.05,230,1701158400"; d="scan'208";a="1119412841" Received: from agluck-desk3.sc.intel.com ([172.25.222.74]) by fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Jan 2024 14:20:42 -0800 From: Tony Luck To: Fenghua Yu , Reinette Chatre , Peter Newman , Jonathan Corbet , Shuah Khan , x86@kernel.org Cc: Shaopeng Tan , James Morse , Jamie Iles , Babu Moger , Randy Dunlap , Drew Fustini , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, patches@lists.linux.dev, Tony Luck Subject: [PATCH v15-RFC 2/8] x86/resctrl: Move all monitoring functions to RDT_RESOURCE_L3_MON Date: Tue, 30 Jan 2024 14:20:28 -0800 Message-ID: <20240130222034.37181-3-tony.luck@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240130222034.37181-1-tony.luck@intel.com> References: <20240126223837.21835-1-tony.luck@intel.com> <20240130222034.37181-1-tony.luck@intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Switch over all places that set up and use monitoring functions to use the new resource structure. 
Signed-off-by: Tony Luck --- arch/x86/kernel/cpu/resctrl/core.c | 6 ++++-- arch/x86/kernel/cpu/resctrl/monitor.c | 12 ++++-------- arch/x86/kernel/cpu/resctrl/rdtgroup.c | 2 +- 3 files changed, 9 insertions(+), 11 deletions(-) diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resct= rl/core.c index c50f55d7790e..0828575c3e13 100644 --- a/arch/x86/kernel/cpu/resctrl/core.c +++ b/arch/x86/kernel/cpu/resctrl/core.c @@ -591,11 +591,13 @@ static void domain_remove_cpu(int cpu, struct rdt_res= ource *r) return; } =20 - if (r =3D=3D &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl) { + if (r =3D=3D &rdt_resources_all[RDT_RESOURCE_L3_MON].r_resctrl) { if (is_mbm_enabled() && cpu =3D=3D d->mbm_work_cpu) { cancel_delayed_work(&d->mbm_over); mbm_setup_overflow_handler(d, 0); } + } + if (r =3D=3D &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl) { if (is_llc_occupancy_enabled() && cpu =3D=3D d->cqm_work_cpu && has_busy_rmid(r, d)) { cancel_delayed_work(&d->cqm_limbo); @@ -826,7 +828,7 @@ static __init bool get_rdt_alloc_resources(void) =20 static __init bool get_rdt_mon_resources(void) { - struct rdt_resource *r =3D &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl; + struct rdt_resource *r =3D &rdt_resources_all[RDT_RESOURCE_L3_MON].r_resc= trl; =20 if (rdt_cpu_has(X86_FEATURE_CQM_OCCUP_LLC)) rdt_mon_features |=3D (1 << QOS_L3_OCCUP_EVENT_ID); diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/re= sctrl/monitor.c index 3a6c069614eb..080cad0d7288 100644 --- a/arch/x86/kernel/cpu/resctrl/monitor.c +++ b/arch/x86/kernel/cpu/resctrl/monitor.c @@ -268,7 +268,7 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, stru= ct rdt_domain *d, */ void __check_limbo(struct rdt_domain *d, bool force_free) { - struct rdt_resource *r =3D &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl; + struct rdt_resource *r =3D &rdt_resources_all[RDT_RESOURCE_L3_MON].r_resc= trl; struct rmid_entry *entry; u32 crmid =3D 1, nrmid; bool rmid_dirty; @@ -333,7 +333,7 @@ int 
alloc_rmid(void) =20 static void add_rmid_to_limbo(struct rmid_entry *entry) { - struct rdt_resource *r =3D &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl; + struct rdt_resource *r =3D &rdt_resources_all[RDT_RESOURCE_L3_MON].r_resc= trl; struct rdt_domain *d; int cpu, err; u64 val =3D 0; @@ -623,7 +623,7 @@ void cqm_handle_limbo(struct work_struct *work) =20 mutex_lock(&rdtgroup_mutex); =20 - r =3D &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl; + r =3D &rdt_resources_all[RDT_RESOURCE_L3_MON].r_resctrl; d =3D container_of(work, struct rdt_domain, cqm_limbo.work); =20 __check_limbo(d, false); @@ -659,7 +659,7 @@ void mbm_handle_overflow(struct work_struct *work) if (!static_branch_likely(&rdt_mon_enable_key)) goto out_unlock; =20 - r =3D &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl; + r =3D &rdt_resources_all[RDT_RESOURCE_L3_MON].r_resctrl; d =3D container_of(work, struct rdt_domain, mbm_over.work); =20 list_for_each_entry(prgrp, &rdt_all_groups, rdtgroup_list) { @@ -736,10 +736,6 @@ static struct mon_evt mbm_local_event =3D { =20 /* * Initialize the event list for the resource. - * - * Note that MBM events are also part of RDT_RESOURCE_L3 resource - * because as per the SDM the total and local memory bandwidth - * are enumerated as part of L3 monitoring. 
*/ static void l3_mon_evt_init(struct rdt_resource *r) { diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/r= esctrl/rdtgroup.c index aa24343f1d23..9ee3a9906781 100644 --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c @@ -2644,7 +2644,7 @@ static int rdt_get_tree(struct fs_context *fc) static_branch_enable_cpuslocked(&rdt_enable_key); =20 if (is_mbm_enabled()) { - r =3D &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl; + r =3D &rdt_resources_all[RDT_RESOURCE_L3_MON].r_resctrl; list_for_each_entry(dom, &r->domains, list) mbm_setup_overflow_handler(dom, MBM_OVERFLOW_INTERVAL); } --=20 2.43.0 From nobody Tue Dec 23 22:06:37 2025 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.115]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AFF287866E; Tue, 30 Jan 2024 22:20:47 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.55.52.115 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706653249; cv=none; b=R5IQ6JoeY8TZo7Cq6cidoP0xMeqopDGOjRkdPzcw2GBl7vpHpGoOj5OOwn2wanhFTc5SwQ82bqKEtVHW5AgTUTYaF/U9TnsBx+T2UHp5wbwNtrDHsg+DWtKOiVvE+lQASlpVhxJWOg4hsl9C9a9BP9wiTKeE0nn6q/NLEDX08Xo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706653249; c=relaxed/simple; bh=Mg7KR2u6rJ9exJMnzpHcf9Nmdz63x0v06nGiR1d+ub4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=nEh0UMRCzJqK0J30JTW8RAVoqyDV+JpYh/OrPyiNobNWIL6GF5nyg2NM7RKLCl5BFyZCR0/E3s3VPka7VCrRVG6q7GzbZlCvgRgWZ8uHA/IEPzNPMR3E2G1C3q4rHybiMhL1fGH/ZdA7LznTgIAQ46sdbU+bRaF3SacPsNP5lbs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=IvivmxG7; arc=none 
smtp.client-ip=192.55.52.115 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="IvivmxG7" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1706653247; x=1738189247; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Mg7KR2u6rJ9exJMnzpHcf9Nmdz63x0v06nGiR1d+ub4=; b=IvivmxG7QVN636fhBIwHINeT+OtvzwmobUvJtyQjEiVCgjGRBxKUMM2D SL4Vmfr7tg0aSeOSsoI+m7pyKJfQMtL3NvRKq/RCVyHETmszqs4yXhmom 3K/yFaj1vNdEr9LUNTHDWjIC0mfDlrCnwbvKvloRxoY+3LhzDk+1pBoMr q+E9/Wag24lgejKD9LFoVUWowA1/6L20SGYUAn+v3LKPoHv5Rw1Wr+PU6 NwpnDIBtSHUjLzP7LDkGlk+WvJfaUughtnSCI+vkkM6quQIK6xVDQfu2s TxN/sDkoYbmlUF08eZckxNlsp7rChnuiTa50zuBggF49MUXOq6/y3+08R g==; X-IronPort-AV: E=McAfee;i="6600,9927,10969"; a="403041733" X-IronPort-AV: E=Sophos;i="6.05,230,1701158400"; d="scan'208";a="403041733" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Jan 2024 14:20:43 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10969"; a="1119412844" X-IronPort-AV: E=Sophos;i="6.05,230,1701158400"; d="scan'208";a="1119412844" Received: from agluck-desk3.sc.intel.com ([172.25.222.74]) by fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Jan 2024 14:20:42 -0800 From: Tony Luck To: Fenghua Yu , Reinette Chatre , Peter Newman , Jonathan Corbet , Shuah Khan , x86@kernel.org Cc: Shaopeng Tan , James Morse , Jamie Iles , Babu Moger , Randy Dunlap , Drew Fustini , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, patches@lists.linux.dev, Tony Luck Subject: [PATCH v15-RFC 3/8] x86/resctrl: Prepare for non-cache-scoped resources Date: Tue, 30 Jan 2024 14:20:29 -0800 
Message-ID: <20240130222034.37181-4-tony.luck@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240130222034.37181-1-tony.luck@intel.com> References: <20240126223837.21835-1-tony.luck@intel.com> <20240130222034.37181-1-tony.luck@intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Not all resources are scoped in line with some level of hardware cache. Prepare by renaming the "cache_level" field to "scope" and changing the type to an enum to ease adding new scopes. Signed-off-by: Tony Luck --- include/linux/resctrl.h | 9 +++++++-- arch/x86/kernel/cpu/resctrl/core.c | 14 +++++++------- arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 2 +- arch/x86/kernel/cpu/resctrl/rdtgroup.c | 2 +- 4 files changed, 16 insertions(+), 11 deletions(-) diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h index 66942d7fba7f..2155dc15e636 100644 --- a/include/linux/resctrl.h +++ b/include/linux/resctrl.h @@ -144,13 +144,18 @@ struct resctrl_membw { struct rdt_parse_data; struct resctrl_schema; =20 +enum resctrl_scope { + RESCTRL_L2_CACHE =3D 2, + RESCTRL_L3_CACHE =3D 3, +}; + /** * struct rdt_resource - attributes of a resctrl resource * @rid: The index of the resource * @alloc_capable: Is allocation available on this machine * @mon_capable: Is monitor feature available on this machine * @num_rmid: Number of RMIDs available - * @cache_level: Which cache level defines scope of this resource + * @scope: Hardware scope for this resource * @cache: Cache allocation related data * @membw: If the component has bandwidth controls, their properties. 
* @domains: All domains for this resource @@ -168,7 +173,7 @@ struct rdt_resource { bool alloc_capable; bool mon_capable; int num_rmid; - int cache_level; + enum resctrl_scope scope; struct resctrl_cache cache; struct resctrl_membw membw; struct list_head domains; diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resct= rl/core.c index 0828575c3e13..d89dce63397b 100644 --- a/arch/x86/kernel/cpu/resctrl/core.c +++ b/arch/x86/kernel/cpu/resctrl/core.c @@ -65,7 +65,7 @@ struct rdt_hw_resource rdt_resources_all[] =3D { .r_resctrl =3D { .rid =3D RDT_RESOURCE_L3_MON, .name =3D "L3", - .cache_level =3D 3, + .scope =3D RESCTRL_L3_CACHE, .domains =3D domain_init(RDT_RESOURCE_L3_MON), .fflags =3D RFTYPE_RES_CACHE, }, @@ -75,7 +75,7 @@ struct rdt_hw_resource rdt_resources_all[] =3D { .r_resctrl =3D { .rid =3D RDT_RESOURCE_L3, .name =3D "L3", - .cache_level =3D 3, + .scope =3D RESCTRL_L3_CACHE, .domains =3D domain_init(RDT_RESOURCE_L3), .parse_ctrlval =3D parse_cbm, .format_str =3D "%d=3D%0*x", @@ -89,7 +89,7 @@ struct rdt_hw_resource rdt_resources_all[] =3D { .r_resctrl =3D { .rid =3D RDT_RESOURCE_L2, .name =3D "L2", - .cache_level =3D 2, + .scope =3D RESCTRL_L2_CACHE, .domains =3D domain_init(RDT_RESOURCE_L2), .parse_ctrlval =3D parse_cbm, .format_str =3D "%d=3D%0*x", @@ -103,7 +103,7 @@ struct rdt_hw_resource rdt_resources_all[] =3D { .r_resctrl =3D { .rid =3D RDT_RESOURCE_MBA, .name =3D "MB", - .cache_level =3D 3, + .scope =3D RESCTRL_L3_CACHE, .domains =3D domain_init(RDT_RESOURCE_MBA), .parse_ctrlval =3D parse_bw, .format_str =3D "%d=3D%*u", @@ -115,7 +115,7 @@ struct rdt_hw_resource rdt_resources_all[] =3D { .r_resctrl =3D { .rid =3D RDT_RESOURCE_SMBA, .name =3D "SMBA", - .cache_level =3D 3, + .scope =3D RESCTRL_L3_CACHE, .domains =3D domain_init(RDT_RESOURCE_SMBA), .parse_ctrlval =3D parse_bw, .format_str =3D "%d=3D%*u", @@ -514,7 +514,7 @@ static int arch_domain_mbm_alloc(u32 num_rmid, struct r= dt_hw_domain *hw_dom) */ static void 
domain_add_cpu(int cpu, struct rdt_resource *r) { - int id =3D get_cpu_cacheinfo_id(cpu, r->cache_level); + int id =3D get_cpu_cacheinfo_id(cpu, r->scope); struct list_head *add_pos =3D NULL; struct rdt_hw_domain *hw_dom; struct rdt_domain *d; @@ -564,7 +564,7 @@ static void domain_add_cpu(int cpu, struct rdt_resource= *r) =20 static void domain_remove_cpu(int cpu, struct rdt_resource *r) { - int id =3D get_cpu_cacheinfo_id(cpu, r->cache_level); + int id =3D get_cpu_cacheinfo_id(cpu, r->scope); struct rdt_hw_domain *hw_dom; struct rdt_domain *d; =20 diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cp= u/resctrl/pseudo_lock.c index 8f559eeae08e..6a72fb627aa5 100644 --- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c +++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c @@ -311,7 +311,7 @@ static int pseudo_lock_region_init(struct pseudo_lock_r= egion *plr) plr->size =3D rdtgroup_cbm_to_size(plr->s->res, plr->d, plr->cbm); =20 for (i =3D 0; i < ci->num_leaves; i++) { - if (ci->info_list[i].level =3D=3D plr->s->res->cache_level) { + if (ci->info_list[i].level =3D=3D plr->s->res->scope) { plr->line_size =3D ci->info_list[i].coherency_line_size; return 0; } diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/r= esctrl/rdtgroup.c index 9ee3a9906781..eff9d87547c9 100644 --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c @@ -1416,7 +1416,7 @@ unsigned int rdtgroup_cbm_to_size(struct rdt_resource= *r, num_b =3D bitmap_weight(&cbm, r->cache.cbm_len); ci =3D get_cpu_cacheinfo(cpumask_any(&d->cpu_mask)); for (i =3D 0; i < ci->num_leaves; i++) { - if (ci->info_list[i].level =3D=3D r->cache_level) { + if (ci->info_list[i].level =3D=3D r->scope) { size =3D ci->info_list[i].size / r->cache.cbm_len * num_b; break; } --=20 2.43.0 From nobody Tue Dec 23 22:06:37 2025 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.115]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No 
client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7B6146F08E; Tue, 30 Jan 2024 22:20:46 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.55.52.115 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706653248; cv=none; b=VHUmYWgHma8UjlO4fb38nqpvXLTOz9uITTqjB2wLre8W1ZIbLD2bUWWAMiWdxvoTlBcxNk4fO96G4RpoG5CDCHCMb+rlx4kGze2k3dg3NSlnItkZ3b4a/JtDdsUYp9DS3q6hZnhUBWc/Bys1XGdxveC9MMnE++aNytfwNyGyn0o= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706653248; c=relaxed/simple; bh=4+Sl+NwFbXMo1a6gRDqTKEVyh7WFVjJ+Uv+TU1KbyG8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=sXAUvkktI2uypnrRUgt1ByoanZEvkZ1d+SKDvwGG9kmyt1HvR7ybywgAQ5zOeFymLzzuCW4RmOJ7fRs+6Rssvbzs3wYEjVB6FUh0tZW3n09dxbPr1T+l93Mr5U8SeeONkvaKA5+qvhvWWdv5Ga6Z5P49jmUbeW8cAYV/1vH7xiM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=bpwyNST5; arc=none smtp.client-ip=192.55.52.115 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="bpwyNST5" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1706653246; x=1738189246; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=4+Sl+NwFbXMo1a6gRDqTKEVyh7WFVjJ+Uv+TU1KbyG8=; b=bpwyNST5RPE33+tfIJuKHNSwe9DFzAjfhV3Qpi26TiNWpLZNvDMJHOfr NM0lsThtKBG5pEQA0ex7mQzU3fSvSDOosBA51u2/ku6q6xGJnVitAnA6u HQDqmo/q/rHbynPIC8UjAaTkU7wADKd6q0grkkF9DTTywFZVnxFfuEBj5 21M7FBYa6U0JHaNOoM5+7qucv1ygdkWExO5raxX4VaQa4luXDtbopViau 
AkB7cm8gf9kMnIR6LaQ4InA92OKWF0hhHdtaiy6Gtj/ElRI7wi+AKP1Gg 6x3o/RaJ/o/q+DoYUOqVTGXF5UckA0CToBfX3X+sfi2gvmXHIRVQK5oJ8 g==; X-IronPort-AV: E=McAfee;i="6600,9927,10969"; a="403041757" X-IronPort-AV: E=Sophos;i="6.05,230,1701158400"; d="scan'208";a="403041757" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Jan 2024 14:20:43 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10969"; a="1119412847" X-IronPort-AV: E=Sophos;i="6.05,230,1701158400"; d="scan'208";a="1119412847" Received: from agluck-desk3.sc.intel.com ([172.25.222.74]) by fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Jan 2024 14:20:42 -0800 From: Tony Luck To: Fenghua Yu , Reinette Chatre , Peter Newman , Jonathan Corbet , Shuah Khan , x86@kernel.org Cc: Shaopeng Tan , James Morse , Jamie Iles , Babu Moger , Randy Dunlap , Drew Fustini , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, patches@lists.linux.dev, Tony Luck Subject: [PATCH v15-RFC 4/8] x86/resctrl: Add helper function to look up domain_id from scope Date: Tue, 30 Jan 2024 14:20:30 -0800 Message-ID: <20240130222034.37181-5-tony.luck@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240130222034.37181-1-tony.luck@intel.com> References: <20240126223837.21835-1-tony.luck@intel.com> <20240130222034.37181-1-tony.luck@intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Prepare for more options for scope of resources. Add some diagnostic messages if lookup fails. 
Signed-off-by: Tony Luck --- arch/x86/kernel/cpu/resctrl/core.c | 29 +++++++++++++++++++++++++++-- 1 file changed, 27 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resct= rl/core.c index d89dce63397b..59e6aa7abef5 100644 --- a/arch/x86/kernel/cpu/resctrl/core.c +++ b/arch/x86/kernel/cpu/resctrl/core.c @@ -499,6 +499,19 @@ static int arch_domain_mbm_alloc(u32 num_rmid, struct = rdt_hw_domain *hw_dom) return 0; } =20 +static int get_domain_id_from_scope(int cpu, enum resctrl_scope scope) +{ + switch (scope) { + case RESCTRL_L2_CACHE: + case RESCTRL_L3_CACHE: + return get_cpu_cacheinfo_id(cpu, scope); + default: + break; + } + + return -EINVAL; +} + /* * domain_add_cpu - Add a cpu to a resource's domain list. * @@ -514,12 +527,18 @@ static int arch_domain_mbm_alloc(u32 num_rmid, struct= rdt_hw_domain *hw_dom) */ static void domain_add_cpu(int cpu, struct rdt_resource *r) { - int id =3D get_cpu_cacheinfo_id(cpu, r->scope); + int id =3D get_domain_id_from_scope(cpu, r->scope); struct list_head *add_pos =3D NULL; struct rdt_hw_domain *hw_dom; struct rdt_domain *d; int err; =20 + if (id < 0) { + pr_warn_once("Can't find domain id for CPU:%d scope:%d for resource %s\n= ", + cpu, r->scope, r->name); + return; + } + d =3D rdt_find_domain(r, id, &add_pos); if (IS_ERR(d)) { pr_warn("Couldn't find cache id for CPU %d\n", cpu); @@ -564,10 +583,16 @@ static void domain_add_cpu(int cpu, struct rdt_resour= ce *r) =20 static void domain_remove_cpu(int cpu, struct rdt_resource *r) { - int id =3D get_cpu_cacheinfo_id(cpu, r->scope); + int id =3D get_domain_id_from_scope(cpu, r->scope); struct rdt_hw_domain *hw_dom; struct rdt_domain *d; =20 + if (id < 0) { + pr_warn_once("Can't find domain id for CPU:%d scope:%d for resource %s\n= ", + cpu, r->scope, r->name); + return; + } + d =3D rdt_find_domain(r, id, NULL); if (IS_ERR_OR_NULL(d)) { pr_warn("Couldn't find cache id for CPU %d\n", cpu); --=20 2.43.0 From nobody Tue Dec 23 22:06:37 
2025 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.115]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D49FD762DD; Tue, 30 Jan 2024 22:20:46 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.55.52.115 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706653248; cv=none; b=VRnHvl3c7XInZj6ZjYvTy2rE3KmXMhz2oStQLdssPAGiALoNGTA/HGvFKY7LCxRpM+FpMiWAMQDr4yrzvZ/dOscV5cRO4Zv/LFSXUyjrNoVqpNYIJ6oL/fYbem7e0eBYuwLU7Djlom3a53sHazLSbm0RHrM0ZmMaewMTHyOa0oc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706653248; c=relaxed/simple; bh=e1U36HnQRGJev5PeIToOOlwXKxHkepDX5Nu7sz6o/IU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=USmDrHpkYSl7TG8i0PPFWpyFpIbSX0bItAleM3bmutU4t1zVT0GBN0QBr32uQCAmGn4PqXUat+FpLUB+e+f5+mtVVQrWxcw878BLx14MpG15NNWXPGVE7n4OcVpWE89Nqo1rPeNK66wjhEpYpNzANSZMn02KatfwJDydxV0CWuo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=fvikW4Vn; arc=none smtp.client-ip=192.55.52.115 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="fvikW4Vn" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1706653246; x=1738189246; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=e1U36HnQRGJev5PeIToOOlwXKxHkepDX5Nu7sz6o/IU=; b=fvikW4Vnk036nODM9ERbvF4Ovr4oz0omwKo3QQzYMIDVqHcm5l/FTwC0 
f/m8U2mZggWS3ydP5DdBGrpBtwnwfHuawRVokehA3ieVQ00+/5wjAyh6W BBoANwsFFL0FopJZinC9o2Ull1mTrNm7L0H5jYTSZIXJiabAytXa+2zx2 Nj/YsWsQnfdrPFVqyPSwOE/eO/sixKQzHozc+3pimooQBBHJkq3D9KBTL C8dlaovg04izVXZXh2ySkwQsNX+XxgYBh0RKII88Fbl/bHkHBCXqSMKZS EdHirtWrVGSHZ4ouvNTbvsgDx2SAzgsCh+enptKbetrtAU3q9ji7ENG14 A==; X-IronPort-AV: E=McAfee;i="6600,9927,10969"; a="403041769" X-IronPort-AV: E=Sophos;i="6.05,230,1701158400"; d="scan'208";a="403041769" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Jan 2024 14:20:43 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10969"; a="1119412850" X-IronPort-AV: E=Sophos;i="6.05,230,1701158400"; d="scan'208";a="1119412850" Received: from agluck-desk3.sc.intel.com ([172.25.222.74]) by fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Jan 2024 14:20:43 -0800 From: Tony Luck To: Fenghua Yu , Reinette Chatre , Peter Newman , Jonathan Corbet , Shuah Khan , x86@kernel.org Cc: Shaopeng Tan , James Morse , Jamie Iles , Babu Moger , Randy Dunlap , Drew Fustini , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, patches@lists.linux.dev, Tony Luck Subject: [PATCH v15-RFC 5/8] x86/resctrl: Add "NODE" as an option for resource scope Date: Tue, 30 Jan 2024 14:20:31 -0800 Message-ID: <20240130222034.37181-6-tony.luck@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240130222034.37181-1-tony.luck@intel.com> References: <20240126223837.21835-1-tony.luck@intel.com> <20240130222034.37181-1-tony.luck@intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Add RESCTRL_NODE to the enum, and to the helper function that looks up a domain id from a scope. There are a couple of places where the scope must be a cache scope. Add some defensive WARN_ON checks to those. 
Signed-off-by: Tony Luck
---
 include/linux/resctrl.h                   | 1 +
 arch/x86/kernel/cpu/resctrl/core.c        | 3 +++
 arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 4 ++++
 arch/x86/kernel/cpu/resctrl/rdtgroup.c    | 3 +++
 4 files changed, 11 insertions(+)

diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 2155dc15e636..e3cddf3f07f8 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -147,6 +147,7 @@ struct resctrl_schema;
 enum resctrl_scope {
 	RESCTRL_L2_CACHE = 2,
 	RESCTRL_L3_CACHE = 3,
+	RESCTRL_NODE,
 };
 
 /**
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 59e6aa7abef5..b741cbf61843 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -505,6 +505,9 @@ static int get_domain_id_from_scope(int cpu, enum resctrl_scope scope)
 	case RESCTRL_L2_CACHE:
 	case RESCTRL_L3_CACHE:
 		return get_cpu_cacheinfo_id(cpu, scope);
+	case RESCTRL_NODE:
+		return cpu_to_node(cpu);
+
 	default:
 		break;
 	}
diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
index 6a72fb627aa5..2bafc73b51e2 100644
--- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
+++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
@@ -292,10 +292,14 @@ static void pseudo_lock_region_clear(struct pseudo_lock_region *plr)
  */
 static int pseudo_lock_region_init(struct pseudo_lock_region *plr)
 {
+	enum resctrl_scope scope = plr->s->res->scope;
 	struct cpu_cacheinfo *ci;
 	int ret;
 	int i;
 
+	if (WARN_ON_ONCE(scope != RESCTRL_L2_CACHE && scope != RESCTRL_L3_CACHE))
+		return -ENODEV;
+
 	/* Pick the first cpu we find that is associated with the cache. */
 	plr->cpu = cpumask_first(&plr->d->cpu_mask);
 
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index eff9d87547c9..770f2bf98462 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1413,6 +1413,9 @@ unsigned int rdtgroup_cbm_to_size(struct rdt_resource *r,
 	unsigned int size = 0;
 	int num_b, i;
 
+	if (WARN_ON_ONCE(r->scope != RESCTRL_L2_CACHE && r->scope != RESCTRL_L3_CACHE))
+		return size;
+
 	num_b = bitmap_weight(&cbm, r->cache.cbm_len);
 	ci = get_cpu_cacheinfo(cpumask_any(&d->cpu_mask));
 	for (i = 0; i < ci->num_leaves; i++) {
-- 
2.43.0

From nobody Tue Dec 23 22:06:37 2025
From: Tony Luck
To: Fenghua Yu, Reinette Chatre, Peter Newman, Jonathan Corbet, Shuah Khan, x86@kernel.org
Cc: Shaopeng Tan, James Morse, Jamie Iles, Babu Moger, Randy Dunlap, Drew Fustini, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, patches@lists.linux.dev, Tony Luck
Subject: [PATCH v15-RFC 6/8] x86/resctrl: Introduce snc_nodes_per_l3_cache
Date: Tue, 30 Jan 2024 14:20:32 -0800
Message-ID: <20240130222034.37181-7-tony.luck@intel.com>
In-Reply-To: <20240130222034.37181-1-tony.luck@intel.com>
References: <20240126223837.21835-1-tony.luck@intel.com> <20240130222034.37181-1-tony.luck@intel.com>

Intel Sub-NUMA Cluster (SNC) is a feature that subdivides the CPU cores
and memory controllers on a socket into two or more groups. These are
presented to the operating system as NUMA nodes.

This may enable some workloads to have slightly lower latency to memory
as the memory controller(s) in an SNC node are electrically closer to the
CPU cores on that SNC node. This benefit may be offset by lower bandwidth
since the memory accesses for each core can only be interleaved between
the memory controllers on the same SNC node.

Resctrl monitoring on an Intel system depends upon attaching RMIDs to tasks
to track L3 cache occupancy and memory bandwidth. There is an MSR that
controls how the RMIDs are shared between SNC nodes.

The default mode divides them numerically. E.g. when there are two SNC
nodes on a socket the lower-numbered half of the RMIDs is given to the
first node, the remainder to the second node. This would be difficult
to use with the Linux resctrl interface as specific RMID values assigned
to resctrl groups are not visible to users.

The other mode divides the RMIDs and renumbers the ones on the second
SNC node to start from zero.

Even with this renumbering SNC mode requires several changes in resctrl
behavior for correct operation.

Add a global integer "snc_nodes_per_l3_cache" that shows how many
SNC nodes share each L3 cache. When "snc_nodes_per_l3_cache" is "1",
SNC mode is either not implemented, or not enabled.

Update all places to take appropriate action when SNC mode is enabled:
1) The number of logical RMIDs per L3 cache available for use is the
   number of physical RMIDs divided by the number of SNC nodes.
2) Likewise the "mon_scale" value must be divided by the number of SNC
   nodes.
3) The RMID renumbering operates when using the value from the
   IA32_PQR_ASSOC MSR to count accesses by a task. When reading an RMID
   counter, adjust from the logical RMID to the physical RMID value for
   the SNC node that it wishes to read and load the adjusted value into
   the IA32_QM_EVTSEL MSR.
4) Divide the L3 cache between the SNC nodes. Divide the value reported
   in the resctrl "size" file by the number of SNC nodes because the
   effective amount of cache that can be allocated is reduced by that
   factor.
5) Disable the "-o mba_MBps" mount option in SNC mode because the
   monitoring is being done per SNC node, while the bandwidth allocation
   is still done at the L3 cache scope. Trying to use this feedback loop
   might result in contradictory changes to the throttling level coming
   from each of the SNC node bandwidth measurements.
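
The arithmetic in items 1) to 4) can be tried out with a small user-space
sketch. The struct, the demo_* helper names and the example numbers (256
physical RMIDs, two SNC nodes) are assumptions made for illustration; only
the divisions and the "(node % snc_nodes) * num_rmid" offset follow the
description above.

```c
/* Hypothetical parameters for one L3 cache; not read from hardware. */
struct demo_snc_cfg {
	unsigned int hw_rmids;     /* physical RMIDs per L3 cache */
	unsigned int hw_occ_scale; /* occupancy scale before SNC adjustment */
	unsigned int snc_nodes;    /* SNC nodes per L3 cache, 1 = SNC off */
};

/* Item 1: logical RMIDs usable on each SNC node. */
static unsigned int demo_num_rmid(const struct demo_snc_cfg *c)
{
	return c->hw_rmids / c->snc_nodes;
}

/* Item 2: the scale factor is divided by the same amount. */
static unsigned int demo_mon_scale(const struct demo_snc_cfg *c)
{
	return c->hw_occ_scale / c->snc_nodes;
}

/* Item 3: logical RMID -> physical RMID for the node being read. */
static unsigned int demo_phys_rmid(const struct demo_snc_cfg *c,
				   unsigned int rmid, unsigned int node)
{
	unsigned int offset = 0;

	if (c->snc_nodes > 1)
		offset = (node % c->snc_nodes) * demo_num_rmid(c);

	return rmid + offset;
}

/* Item 4: value reported in the "size" file for a given raw size. */
static unsigned int demo_reported_size(const struct demo_snc_cfg *c,
				       unsigned int size)
{
	return size / c->snc_nodes;
}
```

For example, with 256 physical RMIDs and two SNC nodes per L3 cache, each
node gets 128 logical RMIDs, and logical RMID 5 read on node 1 (or node 3 on
a second socket) corresponds to physical RMID 133.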

Signed-off-by: Tony Luck
---
 arch/x86/kernel/cpu/resctrl/internal.h |  2 ++
 arch/x86/kernel/cpu/resctrl/core.c     |  6 ++++++
 arch/x86/kernel/cpu/resctrl/monitor.c  | 16 +++++++++++++---
 arch/x86/kernel/cpu/resctrl/rdtgroup.c |  5 +++--
 4 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index c6051bc70e96..d9c6dcf30922 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -428,6 +428,8 @@ DECLARE_STATIC_KEY_FALSE(rdt_alloc_enable_key);
 
 extern struct dentry *debugfs_resctrl;
 
+extern unsigned int snc_nodes_per_l3_cache;
+
 enum resctrl_res_level {
 	RDT_RESOURCE_L3_MON,
 	RDT_RESOURCE_L3,
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index b741cbf61843..dc886d2c9a33 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -48,6 +48,12 @@ int max_name_width, max_data_width;
  */
 bool rdt_alloc_capable;
 
+/*
+ * Number of SNC nodes that share each L3 cache.  Default is 1 for
+ * systems that do not support SNC, or have SNC disabled.
+ */
+unsigned int snc_nodes_per_l3_cache = 1;
+
 static void
 mba_wrmsr_intel(struct rdt_domain *d, struct msr_param *m,
 		struct rdt_resource *r);
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 080cad0d7288..357919bbadbe 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -148,8 +148,18 @@ static inline struct rmid_entry *__rmid_entry(u32 rmid)
 
 static int __rmid_read(u32 rmid, enum resctrl_event_id eventid, u64 *val)
 {
+	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+	int cpu = smp_processor_id();
+	int rmid_offset = 0;
 	u64 msr_val;
 
+	/*
+	 * When SNC mode is on, need to compute the offset to read the
+	 * physical RMID counter for the node to which this CPU belongs.
+	 */
+	if (snc_nodes_per_l3_cache > 1)
+		rmid_offset = (cpu_to_node(cpu) % snc_nodes_per_l3_cache) * r->num_rmid;
+
 	/*
 	 * As per the SDM, when IA32_QM_EVTSEL.EvtID (bits 7:0) is configured
 	 * with a valid event code for supported resource type and the bits
@@ -158,7 +168,7 @@ static int __rmid_read(u32 rmid, enum resctrl_event_id eventid, u64 *val)
 	 * IA32_QM_CTR.Error (bit 63) and IA32_QM_CTR.Unavailable (bit 62)
 	 * are error bits.
 	 */
-	wrmsr(MSR_IA32_QM_EVTSEL, eventid, rmid);
+	wrmsr(MSR_IA32_QM_EVTSEL, eventid, rmid + rmid_offset);
 	rdmsrl(MSR_IA32_QM_CTR, msr_val);
 
 	if (msr_val & RMID_VAL_ERROR)
@@ -757,8 +767,8 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
 	int ret;
 
 	resctrl_rmid_realloc_limit = boot_cpu_data.x86_cache_size * 1024;
-	hw_res->mon_scale = boot_cpu_data.x86_cache_occ_scale;
-	r->num_rmid = boot_cpu_data.x86_cache_max_rmid + 1;
+	hw_res->mon_scale = boot_cpu_data.x86_cache_occ_scale / snc_nodes_per_l3_cache;
+	r->num_rmid = (boot_cpu_data.x86_cache_max_rmid + 1) / snc_nodes_per_l3_cache;
 	hw_res->mbm_width = MBM_CNTR_WIDTH_BASE;
 
 	if (mbm_offset > 0 && mbm_offset <= MBM_CNTR_WIDTH_OFFSET_MAX)
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 770f2bf98462..e639069f871a 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1425,7 +1425,7 @@ unsigned int rdtgroup_cbm_to_size(struct rdt_resource *r,
 		}
 	}
 
-	return size;
+	return size / snc_nodes_per_l3_cache;
 }
 
 /*
@@ -2293,7 +2293,8 @@ static bool supports_mba_mbps(void)
 	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl;
 
 	return (is_mbm_local_enabled() &&
-		r->alloc_capable && is_mba_linear());
+		r->alloc_capable && is_mba_linear() &&
+		snc_nodes_per_l3_cache == 1);
 }
 
 /*
-- 
2.43.0

From nobody Tue Dec 23 22:06:37 2025
From: Tony Luck
To: Fenghua Yu, Reinette Chatre, Peter Newman, Jonathan Corbet, Shuah Khan, x86@kernel.org
Cc: Shaopeng Tan, James Morse, Jamie Iles, Babu Moger, Randy Dunlap, Drew Fustini, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, patches@lists.linux.dev, Tony Luck
Subject: [PATCH v15-RFC 7/8] x86/resctrl: Sub NUMA Cluster detection and enable
Date: Tue, 30 Jan 2024 14:20:33 -0800
Message-ID: <20240130222034.37181-8-tony.luck@intel.com>
In-Reply-To: <20240130222034.37181-1-tony.luck@intel.com>
References: <20240126223837.21835-1-tony.luck@intel.com> <20240130222034.37181-1-tony.luck@intel.com>

There isn't a simple hardware bit that indicates whether a CPU is
running in Sub NUMA Cluster (SNC) mode. Infer the state by comparing
the ratio of NUMA nodes to L3 cache instances.

When SNC mode is detected, reconfigure the RMID counters by updating
the MSR_RMID_SNC_CONFIG MSR on each socket as CPUs are seen.
Update the scope of the RDT_RESOURCE_L3_MON resource to NODE.

Clearing bit zero of the MSR divides the RMIDs and renumbers the ones
on the second SNC node to start from zero.

Signed-off-by: Tony Luck
---
 arch/x86/include/asm/msr-index.h   |   1 +
 arch/x86/kernel/cpu/resctrl/core.c | 119 +++++++++++++++++++++++++++++
 2 files changed, 120 insertions(+)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index f1bd7b91b3c6..f6ba7d0397b8 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -1119,6 +1119,7 @@
 #define MSR_IA32_QM_CTR	0xc8e
 #define MSR_IA32_PQR_ASSOC	0xc8f
 #define MSR_IA32_L3_CBM_BASE	0xc90
+#define MSR_RMID_SNC_CONFIG	0xca0
 #define MSR_IA32_L2_CBM_BASE	0xd10
 #define MSR_IA32_MBA_THRTL_BASE	0xd50
 
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index dc886d2c9a33..84c36e10241f 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -16,11 +16,14 @@
 
 #define pr_fmt(fmt)	"resctrl: " fmt
 
+#include
 #include
 #include
 #include
 #include
+#include
 
+#include
 #include
 #include
 #include "internal.h"
@@ -651,11 +654,42 @@ static void clear_closid_rmid(int cpu)
 	wrmsr(MSR_IA32_PQR_ASSOC, 0, 0);
 }
 
+/*
+ * The power-on reset value of MSR_RMID_SNC_CONFIG is 0x1
+ * which indicates that RMIDs are configured in legacy mode.
+ * This mode is incompatible with Linux resctrl semantics
+ * as RMIDs are partitioned between SNC nodes, which requires
+ * a user to know which RMID is allocated to a task.
+ * Clearing bit 0 reconfigures the RMID counters for use
+ * in Sub NUMA Cluster mode. This mode is better for Linux.
+ * The RMID space is divided between all SNC nodes with the
+ * RMIDs renumbered to start from zero in each node when
+ * counting operations from tasks. Code to read the counters
+ * must adjust RMID counter numbers based on SNC node. See
+ * __rmid_read() for code that does this.
+ */
+static void snc_remap_rmids(int cpu)
+{
+	u64 val;
+
+	/* Only need to enable once per package. */
+	if (cpumask_first(topology_core_cpumask(cpu)) != cpu)
+		return;
+
+	rdmsrl(MSR_RMID_SNC_CONFIG, val);
+	val &= ~BIT_ULL(0);
+	wrmsrl(MSR_RMID_SNC_CONFIG, val);
+}
+
 static int resctrl_online_cpu(unsigned int cpu)
 {
 	struct rdt_resource *r;
 
 	mutex_lock(&rdtgroup_mutex);
+
+	if (snc_nodes_per_l3_cache > 1)
+		snc_remap_rmids(cpu);
+
 	for_each_capable_rdt_resource(r)
 		domain_add_cpu(cpu, r);
 	/* The cpu is set in default rdtgroup after online. */
@@ -910,11 +944,96 @@ static __init bool get_rdt_resources(void)
 	return (rdt_mon_capable || rdt_alloc_capable);
 }
 
+/* CPU models that support MSR_RMID_SNC_CONFIG */
+static const struct x86_cpu_id snc_cpu_ids[] __initconst = {
+	X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_X, 0),
+	X86_MATCH_INTEL_FAM6_MODEL(SAPPHIRERAPIDS_X, 0),
+	X86_MATCH_INTEL_FAM6_MODEL(EMERALDRAPIDS_X, 0),
+	X86_MATCH_INTEL_FAM6_MODEL(GRANITERAPIDS_X, 0),
+	{}
+};
+
+/*
+ * There isn't a simple hardware bit that indicates whether a CPU is running
+ * in Sub NUMA Cluster (SNC) mode. Infer the state by comparing the
+ * ratio of NUMA nodes to L3 cache instances.
+ * It is not possible to accurately determine SNC state if the system is
+ * booted with a maxcpus=N parameter. That distorts the ratio of SNC nodes
+ * to L3 caches. It will be OK if system is booted with hyperthreading
+ * disabled (since this doesn't affect the ratio).
+ */
+static __init int snc_get_config(void)
+{
+	unsigned long *node_caches;
+	int mem_only_nodes = 0;
+	int cpu, node, ret;
+	int num_l3_caches;
+	int cache_id;
+
+	if (!x86_match_cpu(snc_cpu_ids))
+		return 1;
+
+	node_caches = bitmap_zalloc(num_possible_cpus(), GFP_KERNEL);
+	if (!node_caches)
+		return 1;
+
+	cpus_read_lock();
+
+	if (num_online_cpus() != num_present_cpus())
+		pr_warn("Some CPUs offline, SNC detection may be incorrect\n");
+
+	for_each_node(node) {
+		cpu = cpumask_first(cpumask_of_node(node));
+		if (cpu < nr_cpu_ids) {
+			cache_id = get_cpu_cacheinfo_id(cpu, 3);
+			if (cache_id != -1)
+				set_bit(cache_id, node_caches);
+		} else {
+			mem_only_nodes++;
+		}
+	}
+	cpus_read_unlock();
+
+	num_l3_caches = bitmap_weight(node_caches, num_possible_cpus());
+	kfree(node_caches);
+
+	if (!num_l3_caches)
+		goto insane;
+
+	/* sanity check #1: Number of CPU nodes must be multiple of num_l3_caches */
+	if ((nr_node_ids - mem_only_nodes) % num_l3_caches)
+		goto insane;
+
+	ret = (nr_node_ids - mem_only_nodes) / num_l3_caches;
+
+	/* sanity check #2: Only valid results are 1, 2, 3, 4 */
+	switch (ret) {
+	case 1:
+		break;
+	case 2:
+	case 3:
+	case 4:
+		rdt_resources_all[RDT_RESOURCE_L3_MON].r_resctrl.scope = RESCTRL_NODE;
+		pr_info("Sub-NUMA Cluster: %d nodes per L3 cache\n", ret);
+		break;
+	default:
+		goto insane;
+	}
+
+	return ret;
+insane:
+	pr_warn("SNC insanity: CPU nodes = %d num_l3_caches = %d\n",
+		(nr_node_ids - mem_only_nodes), num_l3_caches);
+	return 1;
+}
+
 static __init void rdt_init_res_defs_intel(void)
 {
 	struct rdt_hw_resource *hw_res;
 	struct rdt_resource *r;
 
+	snc_nodes_per_l3_cache = snc_get_config();
+
 	for_each_rdt_resource(r) {
 		hw_res = resctrl_to_arch_res(r);
 
-- 
2.43.0

From nobody Tue Dec 23 22:06:37 2025
From: Tony Luck
To: Fenghua Yu, Reinette Chatre, Peter Newman, Jonathan Corbet, Shuah Khan, x86@kernel.org
Cc: Shaopeng Tan, James Morse, Jamie Iles, Babu Moger, Randy Dunlap, Drew Fustini, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, patches@lists.linux.dev, Tony Luck, Shaopeng Tan, Babu Moger
Subject: [PATCH v15-RFC 8/8] x86/resctrl: Update documentation with Sub-NUMA cluster changes
Date: Tue, 30 Jan 2024 14:20:34 -0800
Message-ID: <20240130222034.37181-9-tony.luck@intel.com>
In-Reply-To: <20240130222034.37181-1-tony.luck@intel.com>
References: <20240126223837.21835-1-tony.luck@intel.com> <20240130222034.37181-1-tony.luck@intel.com>

With Sub-NUMA Cluster mode enabled the scope of monitoring resources is
per-NODE instead of per-L3 cache. Suffixes of directories with "L3" in
their name refer to Sub-NUMA nodes instead of L3 cache ids.

Users should be aware that SNC mode also affects the amount of L3 cache
available for allocation within each SNC node.
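
Because each "mon_L3_XX" directory covers a single SNC node in this mode, a
monitoring tool that does not pin its tasks has to add up the per-node
values itself. A rough user-space sketch of that summation is shown here; the
demo_* function names are hypothetical, the directory and file names follow
the resctrl documentation, and skipping files that do not contain a plain
number (such as "Unavailable") is a choice made for this sketch.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if a mon_data entry is a per-domain L3 directory (mon_L3_XX). */
static int demo_is_l3_mon_dir(const char *name)
{
	return strncmp(name, "mon_L3_", 7) == 0;
}

/*
 * Sum one event file (e.g. "mbm_total_bytes") over every mon_L3_XX
 * directory below mon_data. Returns the number of directories that
 * contributed a value, or -1 if mon_data cannot be opened. Files that
 * do not start with a number are skipped.
 */
static int demo_sum_event(const char *mon_data, const char *event,
			  unsigned long long *total)
{
	struct dirent *de;
	int counted = 0;
	DIR *dir;

	*total = 0;
	dir = opendir(mon_data);
	if (!dir)
		return -1;

	while ((de = readdir(dir)) != NULL) {
		unsigned long long val;
		char path[4096];
		FILE *f;

		if (!demo_is_l3_mon_dir(de->d_name))
			continue;
		/* Skip entries whose full path would not fit the buffer. */
		if (snprintf(path, sizeof(path), "%s/%s/%s", mon_data,
			     de->d_name, event) >= (int)sizeof(path))
			continue;
		f = fopen(path, "r");
		if (!f)
			continue;
		if (fscanf(f, "%llu", &val) == 1) {
			*total += val;
			counted++;
		}
		fclose(f);
	}
	closedir(dir);

	return counted;
}
```

Assuming resctrl is mounted at /sys/fs/resctrl, a call such as
demo_sum_event("/sys/fs/resctrl/mon_data", "mbm_total_bytes", &total) would
give the combined traffic of the default group across all SNC nodes.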

Tested-by: Shaopeng Tan
Reviewed-by: Peter Newman
Reviewed-by: Reinette Chatre
Reviewed-by: Shaopeng Tan
Reviewed-by: Babu Moger
Signed-off-by: Tony Luck
---
 Documentation/arch/x86/resctrl.rst | 25 +++++++++++++++++++++----
 1 file changed, 21 insertions(+), 4 deletions(-)

diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
index a6279df64a9d..15f1cff6ee76 100644
--- a/Documentation/arch/x86/resctrl.rst
+++ b/Documentation/arch/x86/resctrl.rst
@@ -366,10 +366,10 @@ When control is enabled all CTRL_MON groups will also contain:
 When monitoring is enabled all MON groups will also contain:
 
 "mon_data":
-	This contains a set of files organized by L3 domain and by
-	RDT event. E.g. on a system with two L3 domains there will
-	be subdirectories "mon_L3_00" and "mon_L3_01". Each of these
-	directories have one file per event (e.g. "llc_occupancy",
+	This contains a set of files organized by L3 domain or by NUMA
+	node (depending on whether Sub-NUMA Cluster (SNC) mode is disabled
+	or enabled respectively) and by RDT event. Each of these
+	directories has one file per event (e.g. "llc_occupancy",
 	"mbm_total_bytes", and "mbm_local_bytes"). In a MON group these
 	files provide a read out of the current value of the event for
 	all tasks in the group. In CTRL_MON groups these files provide
@@ -478,6 +478,23 @@ if non-contiguous 1s value is supported. On a system with a 20-bit mask
 each bit represents 5% of the capacity of the cache. You could partition
 the cache into four equal parts with masks: 0x1f, 0x3e0, 0x7c00, 0xf8000.
 
+Notes on Sub-NUMA Cluster mode
+==============================
+When SNC mode is enabled, Linux may load balance tasks between Sub-NUMA
+nodes much more readily than between regular NUMA nodes since the CPUs
+on Sub-NUMA nodes share the same L3 cache and the system may report
+the NUMA distance between Sub-NUMA nodes with a lower value than used
+for regular NUMA nodes. Users who do not bind tasks to the CPUs of a
+specific Sub-NUMA node must read the "llc_occupancy", "mbm_total_bytes",
+and "mbm_local_bytes" for all Sub-NUMA nodes where the tasks may execute
+to get the full view of traffic for which the tasks were the source.
+
+The cache allocation feature still provides the same number of
+bits in a mask to control allocation into the L3 cache, but each
+of those ways has its capacity reduced because the cache is divided
+between the SNC nodes. The values reported in the resctrl
+"size" files are adjusted accordingly.
+
 Memory bandwidth Allocation and monitoring
 ==========================================
 
-- 
2.43.0