[PATCH v3 00/10] selftests/resctrl: Fixes and improvements focused on Intel platforms

Reinette Chatre posted 10 patches 3 weeks, 3 days ago
There is a newer version of this series
Posted by Reinette Chatre 3 weeks, 3 days ago
Changes since v2:
- v2: https://lore.kernel.org/linux-patches/cover.1772582958.git.reinette.chatre@intel.com/
- Rebased on top of v7.0-rc3.
- Split "selftests/resctrl: Improve accuracy of cache occupancy test" into
  changes impacting L3 and L2 respectively. (Ilpo)
- "long_mask" -> "full_mask", "return_value" -> "measurement", "org_count"
  -> "orig_count". (Ilpo)
- Use PATH_MAX where appropriate. (Ilpo)
- Handle errors first to reduce indentation. (Ilpo)
- Detailed changes in changelogs.
- No functional changes since v2. Series tested by running 20 iterations of all
  tests on Emerald Rapids, Granite Rapids, Sapphire Rapids, Ice Lake, Sierra
  Forest, and Broadwell.

Changes since v1:
- v1: https://lore.kernel.org/lkml/cover.1770406608.git.reinette.chatre@intel.com/
- The new perf interface that resctrl selftests can utilize has been accepted and
  merged into v7.0-rc2. This series can thus now be considered for inclusion.
  For reference,
  commit 6a8a48644c4b ("perf/x86/intel/uncore: Add per-scheduler IMC CAS count events")
  The resctrl selftest changes making use of the new perf interface are backward
  compatible. The selftests do not require a v7.0-rc2 kernel to run but the
  tests can only pass on recent Intel platforms running v7.0-rc2 or later.
- Combine the two outstanding resctrl selftest submissions into one series
  for easier tracking:
  https://lore.kernel.org/lkml/084e82b5c29d75f16f24af8768d50d39ba0118a5.1769101788.git.reinette.chatre@intel.com/
  https://lore.kernel.org/lkml/cover.1770406608.git.reinette.chatre@intel.com/
- Fix typo in changelog of "selftests/resctrl: Improve accuracy of cache
  occupancy test": "the data my be in L2" -> "the data may be in L2"
- Add Zide Chen's RB tags.

Cover letter updated to be accurate wrt perf changes:

The resctrl selftests fail on recent Intel platforms: the CAT test fails
intermittently, and the MBM and MBA tests fail permanently on new platforms
like Sierra Forest and Granite Rapids.

The MBM and MBA resctrl selftests both generate memory traffic and compare the
memory bandwidth measurements between the iMC PMUs and MBM to determine pass or
fail. Both these tests are failing on recent platforms like Sierra Forest and
Granite Rapids that have two events that need to be read and combined
for a total memory bandwidth count instead of the single event available on
earlier platforms.
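
The combining step described above can be sketched roughly as follows. This
is an illustration only, not code from the series: struct imc_event, its
fields, and imc_total_mib() are hypothetical names, and the sketch assumes
each raw count has already been read and is paired with a scale factor that
converts it to MiB.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-event record: raw count plus the factor scaling it to MiB. */
struct imc_event {
	uint64_t count;
	double scale;
};

/*
 * Sum the scaled counts of all events that belong to one iMC.
 * With a single read event nr_events is 1; where two events must be
 * combined, both contribute to the total.
 */
double imc_total_mib(const struct imc_event *events, size_t nr_events)
{
	double total = 0.0;
	size_t i;

	assert(events || nr_events == 0);

	for (i = 0; i < nr_events; i++)
		total += (double)events[i].count * events[i].scale;

	return total;
}
```

Keeping the number of events as a parameter means the same loop covers both
the older single-event case and the newer two-event case.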

resctrl selftests prefer to obtain event details via sysfs instead of adding
model-specific details on which events to read. Enhancements to perf to expose
the new event details are available since:
 commit 6a8a48644c4b ("perf/x86/intel/uncore: Add per-scheduler IMC CAS count events")
This series demonstrates use of the new sysfs interface to perf
to obtain accurate iMC read memory bandwidth measurements.
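
To illustrate what obtaining event details via sysfs can look like, here is a
minimal sketch of parsing one event description string. It assumes the
"event=<value>,umask=<value>" layout commonly used by perf event files in
sysfs. struct event_cfg and parse_event_cfg() are hypothetical names, not the
helpers used in the series, and reading the file itself is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the fields of interest. */
struct event_cfg {
	uint64_t event;
	uint64_t umask;
};

/*
 * Parse a comma-separated "key=value" string such as
 * "event=0x05,umask=0xcf\n" (values here are made up).
 * Returns 0 on success, -1 if no "event=" term was found.
 */
int parse_event_cfg(const char *text, struct event_cfg *cfg)
{
	const char *p = text;
	int found_event = 0;

	assert(text && cfg);

	cfg->event = 0;
	cfg->umask = 0;

	while (*p) {
		/* Length of the current term, up to ',' or newline. */
		size_t len = strcspn(p, ",\n");

		if (len > 6 && !strncmp(p, "event=", 6)) {
			cfg->event = strtoull(p + 6, NULL, 0);
			found_event = 1;
		} else if (len > 6 && !strncmp(p, "umask=", 6)) {
			cfg->umask = strtoull(p + 6, NULL, 0);
		}

		p += len;
		if (*p)
			p++;	/* skip the separator */
	}

	return found_event ? 0 : -1;
}
```

Unknown terms are skipped rather than rejected, so a string carrying extra
fields still parses.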

An additional issue with all the tests is that these selftests are partly
performance tests and determine pass/fail using performance heuristics selected
after running the tests on a variety of platforms. When new platforms
arrive, the previous heuristics may cause the tests to fail. These failures are
not caused by an issue with the resctrl subsystem the tests intend to test
but by the architectural changes in the new platforms.

Adapt the resctrl tests to not be as sensitive to architectural changes
while adjusting the remaining heuristics to ensure tests pass on a variety
of platforms. More details in individual patches.
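
As a rough illustration of the kind of heuristic involved, the sketch below
compares an iMC bandwidth value against an MBM value and skips the comparison
when the bandwidth is too low for the relative difference to be meaningful.
Everything in it is an assumption made for the example: the names
compare_bw() and enum bw_verdict, the parameters, and the idea of a
percentage tolerance are not taken from the patches.

```c
/* Hypothetical outcome of one bandwidth comparison. */
enum bw_verdict {
	BW_SKIPPED,	/* bandwidth too low to compare reliably */
	BW_PASS,
	BW_FAIL,
};

/*
 * Compare iMC and MBM bandwidth (same unit). The comparison is only
 * made at or above min_bw; below it, small absolute differences turn
 * into large percentages and say little about correctness.
 */
enum bw_verdict compare_bw(double imc_bw, double mbm_bw,
			   double min_bw, double max_diff_percent)
{
	double diff;

	if (imc_bw <= 0.0 || imc_bw < min_bw)
		return BW_SKIPPED;

	diff = imc_bw > mbm_bw ? imc_bw - mbm_bw : mbm_bw - imc_bw;

	return (diff / imc_bw) * 100.0 <= max_diff_percent ? BW_PASS : BW_FAIL;
}
```

With min_bw at 10 and max_diff_percent at 8, for example, compare_bw(1000.0,
950.0, 10.0, 8.0) differs by 5% and passes, while a reading of 5.0 is skipped
regardless of the MBM value.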

Tested by running 100 iterations of all tests on Emerald Rapids, Granite
Rapids, Sapphire Rapids, Ice Lake, Sierra Forest, and Broadwell.

Reinette Chatre (10):
  selftests/resctrl: Improve accuracy of cache occupancy test
  selftests/resctrl: Reduce interference from L2 occupancy during cache
    occupancy test
  selftests/resctrl: Do not store iMC counter value in counter config
    structure
  selftests/resctrl: Prepare for parsing multiple events per iMC
  selftests/resctrl: Support multiple events associated with iMC
  selftests/resctrl: Increase size of buffer used in MBM and MBA tests
  selftests/resctrl: Raise threshold at which MBM and PMU values are
    compared
  selftests/resctrl: Remove requirement on cache miss rate
  selftests/resctrl: Simplify perf usage in CAT test
  selftests/resctrl: Reduce L2 impact on CAT test

 tools/testing/selftests/resctrl/cache.c       |  30 ++--
 tools/testing/selftests/resctrl/cat_test.c    |  41 ++----
 tools/testing/selftests/resctrl/cmt_test.c    |  36 ++++-
 tools/testing/selftests/resctrl/fill_buf.c    |   4 +-
 tools/testing/selftests/resctrl/mba_test.c    |   6 +-
 tools/testing/selftests/resctrl/mbm_test.c    |   6 +-
 tools/testing/selftests/resctrl/resctrl.h     |  20 ++-
 tools/testing/selftests/resctrl/resctrl_val.c | 135 +++++++++++++-----
 8 files changed, 179 insertions(+), 99 deletions(-)

-- 
2.50.1
Re: [PATCH v3 00/10] selftests/resctrl: Fixes and improvements focused on Intel platforms
Posted by Shuah Khan 6 days, 12 hours ago
On 3/13/26 14:32, Reinette Chatre wrote:
> [...]

Let me know if this series is ready to go into Linux 7.1.

thanks,
-- Shuah
Re: [PATCH v3 00/10] selftests/resctrl: Fixes and improvements focused on Intel platforms
Posted by Reinette Chatre 6 days, 11 hours ago
Hi Shuah,

On 3/31/26 12:13 PM, Shuah Khan wrote:
> Let me know if this series is ready to go into Linux 7.1

Thank you very much for checking in on this series. Yes, this is ready to go in at your convenience.

Reinette