[PATCH v7 0/8] Add support for Sub-NUMA cluster (SNC) systems

Tony Luck posted 8 patches 2 years, 2 months ago
There is a newer version of this series
[PATCH v7 0/8] Add support for Sub-NUMA cluster (SNC) systems
Posted by Tony Luck 2 years, 2 months ago
The Sub-NUMA cluster feature on some Intel processors partitions
the CPUs that share an L3 cache into two or more sets. This plays
havoc with the Resource Director Technology (RDT) monitoring features.
Prior to this patch Intel has advised that SNC and RDT are incompatible.

Some of these CPU support an MSR that can partition the RMID
counters in the same way. This allows for monitoring features
to be used (with the caveat that memory accesses between different
SNC NUMA nodes may still not be counted accuratlely.

Note that this patch series improves resctrl reporting considerably
on systems with SNC enabled, but there will still be some anomalies
for processes accessing memory from other sub-NUMA nodes.

Signed-off-by: Tony Luck <tony.luck@intel.com>

Tony Luck (8):
  x86/resctrl: Prepare for new domain scope
  x86/resctrl: Prepare to split rdt_domain structure
  x86/resctrl: Prepare for different scope for control/monitor
    operations
  x86/resctrl: Split the rdt_domain and rdt_hw_domain structures
  x86/resctrl: Add node-scope to the options for feature scope
  x86/resctrl: Introduce snc_nodes_per_l3_cache
  x86/resctrl: Sub NUMA Cluster detection and enable
  x86/resctrl: Update documentation with Sub-NUMA cluster changes

 Documentation/arch/x86/resctrl.rst        |  23 +-
 include/linux/resctrl.h                   |  85 +++--
 arch/x86/include/asm/msr-index.h          |   1 +
 arch/x86/kernel/cpu/resctrl/internal.h    |  66 ++--
 arch/x86/kernel/cpu/resctrl/core.c        | 400 +++++++++++++++++-----
 arch/x86/kernel/cpu/resctrl/ctrlmondata.c |  58 ++--
 arch/x86/kernel/cpu/resctrl/monitor.c     |  58 ++--
 arch/x86/kernel/cpu/resctrl/pseudo_lock.c |  14 +-
 arch/x86/kernel/cpu/resctrl/rdtgroup.c    | 132 +++----
 9 files changed, 591 insertions(+), 246 deletions(-)


base-commit: 6465e260f48790807eef06b583b38ca9789b6072
-- 
2.41.0
RE: [PATCH v7 0/8] Add support for Sub-NUMA cluster (SNC) systems
Posted by Luck, Tony 2 years, 2 months ago
> The Sub-NUMA cluster feature on some Intel processors partitions
> the CPUs that share an L3 cache into two or more sets. This plays
> havoc with the Resource Director Technology (RDT) monitoring features.
> Prior to this patch Intel has advised that SNC and RDT are incompatible.
>
> Some of these CPU support an MSR that can partition the RMID
> counters in the same way. This allows for monitoring features
> to be used (with the caveat that memory accesses between different
> SNC NUMA nodes may still not be counted accuratlely.
>
> Note that this patch series improves resctrl reporting considerably
> on systems with SNC enabled, but there will still be some anomalies
> for processes accessing memory from other sub-NUMA nodes.

Bother .. forgot to add the changes since last version summary
to the cover letter. I fixed all the issues called out by Peter Newman
in his review of v6 series. Specific details are included in each patch
(except for patch 0005 which is unchanged).

I added Peter's "Reviewed-by" to patches where he offered it AND
where I didn't make substantive changes (parts 4, 5, 6, 7)

-Tony
Re: [PATCH v7 0/8] Add support for Sub-NUMA cluster (SNC) systems
Posted by Reinette Chatre 2 years, 2 months ago

On 10/3/2023 9:16 AM, Luck, Tony wrote:
>> The Sub-NUMA cluster feature on some Intel processors partitions
>> the CPUs that share an L3 cache into two or more sets. This plays
>> havoc with the Resource Director Technology (RDT) monitoring features.
>> Prior to this patch Intel has advised that SNC and RDT are incompatible.
>>
>> Some of these CPU support an MSR that can partition the RMID
>> counters in the same way. This allows for monitoring features
>> to be used (with the caveat that memory accesses between different
>> SNC NUMA nodes may still not be counted accuratlely.

The typo that I pointed out in V4 as well as V5 remains.
Not fixing something this fundamental reflects poorly on the rest
of this work.

Reinette