[PATCH v6 00/40] arm_mpam: Add KVM/arm64 and resctrl glue code

Ben Horgan posted 40 patches 2 weeks, 5 days ago
[PATCH v6 00/40] arm_mpam: Add KVM/arm64 and resctrl glue code
Posted by Ben Horgan 2 weeks, 5 days ago
This version of the mpam missing pieces series sees a couple of things
dropped or hidden. Memory bandwidth utilization with free-running counters
is dropped in favour of just always using 'mbm_event' mode (ABMC
emulation) which simplifies the code and allows for, in the future,
filtering by read/write traffic. So, for the interim, there is no memory
bandwidth utilization support. CDP is hidden behind CONFIG_EXPERT as
remount of resctrl fs could potentially lead to out-of-range PARTIDs being
used and the fix requires a change in fs/resctrl. The setting of MPAM2_EL2
(for pKVM/nVHE) is dropped as too expensive a write for not much value.

There are a couple of 'fixes' at the start of the series which address
problems in the base driver but are only user visible due to this series.

Changelogs in patches

Thanks for all the reviewing and testing so far. Just a bit more to get this
over the line.

There is a small build conflict with the MPAM abmc precursors series [1], which
alters some of the resctrl arch hooks. I will shortly be posting a respin
of that too.

[1] https://lore.kernel.org/lkml/20260225201905.3568624-1-ben.horgan@arm.com/

From James' cover letter:

This is the missing piece to make MPAM usable via resctrl in user-space. This has
shed its debugfs code and the read/write 'event configuration' for the monitors
to make the series smaller.

This adds the arch code and KVM support first. I anticipate the whole thing
going via arm64, but if it goes via tip instead, then an immutable branch with those
patches should be easy to do.

Generally the resctrl glue code works by picking what MPAM features it can expose
from the MPAM driver, then configuring the structs that back the resctrl helpers.
If your platform is sufficiently Xeon shaped, you should be able to get L2/L3 CPOR
bitmaps exposed via resctrl. CSU counters work if they are on/after the L3. MBWU
counters are considerably more hairy, and depend on heuristics around the topology,
and a bunch of stuff trying to emulate ABMC.
If it didn't pick what you wanted it to, please share the debug messages produced
when enabling dynamic debug and booting with:
| dyndbg="file mpam_resctrl.c +pl"
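
For anyone scripting this, the query can be expressed once and reused. The
Python sketch below is illustrative only: the helper names are invented here,
and the runtime path assumes debugfs is mounted at /sys/kernel/debug on a
kernel built with CONFIG_DYNAMIC_DEBUG. Enabling the query at runtime is
likely to miss the messages printed while the glue code picks its resources
at boot, so the command-line form above remains the one to use for reports.

```python
from pathlib import Path

# Query from the cover letter: enable (p) the debug messages in
# mpam_resctrl.c and include the source line number (l) in each one.
MPAM_QUERY = "file mpam_resctrl.c +pl"


def boot_param(query: str = MPAM_QUERY) -> str:
    """Return the kernel command-line fragment for a dyndbg query."""
    return f'dyndbg="{query}"'


def apply_at_runtime(
    query: str = MPAM_QUERY,
    control: str = "/sys/kernel/debug/dynamic_debug/control",
) -> None:
    """Write the query to the dynamic debug control file (needs root).

    The default path is an assumption about where debugfs is mounted.
    """
    Path(control).write_text(query + "\n")
```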

I've not found a platform that can test all the behaviours around the monitors,
so this is where I'd expect the most bugs.

The MPAM spec that describes all the system and MMIO registers can be found here:
https://developer.arm.com/documentation/ddi0598/db/?lang=en
(Ignore the 'RETIRED' warning - that is just Arm moving the documentation around.
 This document has the best overview.)


Based on v7.0-rc3

The series can be retrieved from:
https://gitlab.arm.com/linux-arm/linux-bh.git mpam_resctrl_glue_v6

v5 can be found at:
https://lore.kernel.org/linux-arm-kernel/20260224175720.2663924-1-ben.horgan@arm.com/

v4 can be found at:
https://lore.kernel.org/linux-arm-kernel/20260203214342.584712-1-ben.horgan@arm.com/

v3 can be found at:
https://lore.kernel.org/linux-arm-kernel/20260112165914.4086692-1-ben.horgan@arm.com/

v2 can be found at:
https://lore.kernel.org/linux-arm-kernel/20251219181147.3404071-1-ben.horgan@arm.com/

rfc can be found at:
https://lore.kernel.org/linux-arm-kernel/20251205215901.17772-1-james.morse@arm.com/

Ben Horgan (11):
  arm_mpam: Reset when feature configuration bit unset
  arm64/sysreg: Add MPAMSM_EL1 register
  KVM: arm64: Preserve host MPAM configuration when changing traps
  KVM: arm64: Make MPAMSM_EL1 accesses UNDEF
  arm64: mpam: Drop the CONFIG_EXPERT restriction
  arm64: mpam: Initialise and context switch the MPAMSM_EL1 register
  arm_mpam: resctrl: Hide CDP emulation behind CONFIG_EXPERT
  arm_mpam: resctrl: Add rmid index helpers
  arm_mpam: resctrl: Wait for cacheinfo to be ready
  arm_mpam: resctrl: Add monitor initialisation and domain boilerplate
  arm64: mpam: Add initial MPAM documentation

Dave Martin (2):
  arm_mpam: resctrl: Convert to/from MPAMs fixed-point formats
  arm_mpam: resctrl: Add kunit test for control format conversions

James Morse (22):
  arm64: mpam: Context switch the MPAM registers
  arm64: mpam: Re-initialise MPAM regs when CPU comes online
  arm64: mpam: Advertise the CPUs MPAM limits to the driver
  arm64: mpam: Add cpu_pm notifier to restore MPAM sysregs
  arm64: mpam: Add helpers to change a task or cpu's MPAM PARTID/PMG
    values
  KVM: arm64: Force guest EL1 to use user-space's partid configuration
  arm_mpam: resctrl: Add boilerplate cpuhp and domain allocation
  arm_mpam: resctrl: Pick the caches we will use as resctrl resources
  arm_mpam: resctrl: Implement resctrl_arch_reset_all_ctrls()
  arm_mpam: resctrl: Add resctrl_arch_get_config()
  arm_mpam: resctrl: Implement helpers to update configuration
  arm_mpam: resctrl: Add plumbing against arm64 task and cpu hooks
  arm_mpam: resctrl: Add CDP emulation
  arm_mpam: resctrl: Add support for 'MB' resource
  arm_mpam: resctrl: Add support for csu counters
  arm_mpam: resctrl: Allow resctrl to allocate monitors
  arm_mpam: resctrl: Add resctrl_arch_rmid_read()
  arm_mpam: resctrl: Update the rmid reallocation limit
  arm_mpam: resctrl: Add empty definitions for assorted resctrl
    functions
  arm64: mpam: Select ARCH_HAS_CPU_RESCTRL
  arm_mpam: resctrl: Call resctrl_init() on platforms that can support
    resctrl
  arm_mpam: Quirk CMN-650's CSU NRDY behaviour

Shanker Donthineni (4):
  arm_mpam: Add quirk framework
  arm_mpam: Add workaround for T241-MPAM-1
  arm_mpam: Add workaround for T241-MPAM-4
  arm_mpam: Add workaround for T241-MPAM-6

Zeng Heng (1):
  arm_mpam: Ensure in_reset_state is false after applying configuration

 Documentation/arch/arm64/index.rst          |    1 +
 Documentation/arch/arm64/mpam.rst           |   72 +
 Documentation/arch/arm64/silicon-errata.rst |    9 +
 arch/arm64/Kconfig                          |    6 +-
 arch/arm64/include/asm/el2_setup.h          |    3 +-
 arch/arm64/include/asm/mpam.h               |   96 ++
 arch/arm64/include/asm/resctrl.h            |    2 +
 arch/arm64/include/asm/thread_info.h        |    3 +
 arch/arm64/kernel/Makefile                  |    1 +
 arch/arm64/kernel/cpufeature.c              |   21 +-
 arch/arm64/kernel/mpam.c                    |   62 +
 arch/arm64/kernel/process.c                 |    7 +
 arch/arm64/kvm/hyp/include/hyp/switch.h     |   12 +-
 arch/arm64/kvm/hyp/vhe/sysreg-sr.c          |   16 +
 arch/arm64/kvm/sys_regs.c                   |    2 +
 arch/arm64/tools/sysreg                     |    8 +
 drivers/resctrl/Kconfig                     |    9 +-
 drivers/resctrl/Makefile                    |    1 +
 drivers/resctrl/mpam_devices.c              |  303 +++-
 drivers/resctrl/mpam_internal.h             |  104 +-
 drivers/resctrl/mpam_resctrl.c              | 1710 +++++++++++++++++++
 drivers/resctrl/test_mpam_resctrl.c         |  315 ++++
 include/linux/arm_mpam.h                    |   32 +
 23 files changed, 2740 insertions(+), 55 deletions(-)
 create mode 100644 Documentation/arch/arm64/mpam.rst
 create mode 100644 arch/arm64/include/asm/mpam.h
 create mode 100644 arch/arm64/include/asm/resctrl.h
 create mode 100644 arch/arm64/kernel/mpam.c
 create mode 100644 drivers/resctrl/mpam_resctrl.c
 create mode 100644 drivers/resctrl/test_mpam_resctrl.c

-- 
2.43.0
Re: [PATCH v6 00/40] arm_mpam: Add KVM/arm64 and resctrl glue code
Posted by Fenghua Yu an hour ago

On 3/13/26 07:45, Ben Horgan wrote:
> This version of the mpam missing pieces series sees a couple of things
> dropped or hidden. Memory bandwidth utilization with free-running counters
> is dropped in favour of just always using 'mbm_event' mode (ABMC
> emulation) which simplifies the code and allows for, in the future,
> filtering by read/write traffic. So, for the interim, there is no memory
> bandwidth utilization support. CDP is hidden behind CONFIG_EXPERT as
> remount of resctrl fs could potentially lead to out-of-range PARTIDs being
> used and the fix requires a change in fs/resctrl. The setting of MPAM2_EL2
> (for pKVM/nVHE) is dropped as too expensive a write for not much value.
> 
> There are a couple of 'fixes' at the start of the series which address
> problems in the base driver but are only user visible due to this series.

Tested-by: Fenghua Yu <fenghuay@nvidia.com>

Thanks.

-Fenghua
Re: [PATCH v6 00/40] arm_mpam: Add KVM/arm64 and resctrl glue code
Posted by Jesse Chick 2 weeks, 2 days ago
Hi Ben,


> Thanks for all the reviewing and testing so far. Just a bit more to get this
> over the line.


I tested this patch series, specifically L3 CPOR, on an Ampere
implementation with common benchmarking tools (lmbench, multichase).


> Generally the resctrl glue code works by picking what MPAM features it can expose
> from the MPAM driver, then configuring the structs that back the resctrl helpers.
> If your platform is sufficiently Xeon shaped, you should be able to get L2/L3 CPOR
> bitmaps exposed via resctrl. CSU counters work if they are on/after the L3. MBWU
> counters are considerably more hairy, and depend on heuristics around the topology,
> and a bunch of stuff trying to emulate ABMC.


The observed latency in increasingly large R/W operations scaled as
expected across various numbers of competing processes and cache
portions. The distinct jumps in latency as successive cache level
capacities were exceeded indicate that (a) L3 CPOR is configurable from
user space via the resctrl interface as described in this series and (b)
the feature itself is working correctly.
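
A setup of this kind can be reproduced with the standard resctrl files. The
following Python sketch is not the script used for the testing above; the
group name, bitmap values and helper names are assumptions made for
illustration, and it expects resctrl to be mounted at /sys/fs/resctrl:

```python
import os


def contiguous_cbm(first_portion: int, count: int) -> int:
    """Bitmap with `count` adjacent cache portions set, starting at `first_portion`."""
    if count < 1 or first_portion < 0:
        raise ValueError("need at least one portion and a non-negative start")
    return ((1 << count) - 1) << first_portion


def schemata_line(resource: str, masks: dict) -> str:
    """Format e.g. {0: 0xf, 1: 0xf0} as 'L3:0=f;1=f0' (hex, one entry per domain)."""
    body = ";".join(f"{dom}={mask:x}" for dom, mask in sorted(masks.items()))
    return f"{resource}:{body}"


def create_group(name, line, pids=(), root="/sys/fs/resctrl"):
    """Create a control group, write one schemata line, then move tasks into it."""
    group = os.path.join(root, name)
    os.makedirs(group, exist_ok=True)
    with open(os.path.join(group, "schemata"), "w") as f:
        f.write(line + "\n")
    # The tasks file takes one PID per write, so open it afresh for each PID.
    for pid in pids:
        with open(os.path.join(group, "tasks"), "a") as f:
            f.write(f"{pid}\n")
    return group
```

For example, create_group("bench", schemata_line("L3", {0: contiguous_cbm(0, 4)}),
pids=[os.getpid()]) would confine the calling process to the lowest four L3
portions of cache domain 0, provided the platform exposes at least four.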

For the series:

Tested-by: Jesse Chick <jessechick@os.amperecomputing.com>

Thanks,
Jesse
Re: [PATCH v6 00/40] arm_mpam: Add KVM/arm64 and resctrl glue code
Posted by Ben Horgan 2 weeks ago
Hi Jesse,

On 3/17/26 00:25, Jesse Chick wrote:
> Hi Ben,
> 
> 
>> Thanks for all the reviewing and testing so far. Just a bit more to get this
>> over the line.
> 
> 
> I tested this patch series, specifically L3 CPOR, on an Ampere
> implementation with common benchmarking tools (lmbench, multichase).
> 
> 
>> Generally the resctrl glue code works by picking what MPAM features it can expose
>> from the MPAM driver, then configuring the structs that back the resctrl helpers.
>> If your platform is sufficiently Xeon shaped, you should be able to get L2/L3 CPOR
>> bitmaps exposed via resctrl. CSU counters work if they are on/after the L3. MBWU
>> counters are considerably more hairy, and depend on heuristics around the topology,
>> and a bunch of stuff trying to emulate ABMC.
> 
> 
> The observed latency in increasingly large R/W operations scaled as
> expected across various numbers of competing processes and cache
> portions. The distinct jumps in latency as successive cache level
> capacities were exceeded indicate that (a) L3 CPOR is configurable from
> user space via the resctrl interface as described in this series and (b)
> the feature itself is working correctly.
> 
> For the series:
> 
> Tested-by: Jesse Chick <jessechick@os.amperecomputing.com>

Thanks for the testing and description.

Ben

> 
> Thanks,
> Jesse
> 
>
Re: [PATCH v6 00/40] arm_mpam: Add KVM/arm64 and resctrl glue code
Posted by Gavin Shan 1 week, 2 days ago
On 3/14/26 12:45 AM, Ben Horgan wrote:
> This version of the mpam missing pieces series sees a couple of things
> dropped or hidden. Memory bandwidth utilization with free-running counters
> is dropped in favour of just always using 'mbm_event' mode (ABMC
> emulation) which simplifies the code and allows for, in the future,
> filtering by read/write traffic. So, for the interim, there is no memory
> bandwidth utilization support. CDP is hidden behind CONFIG_EXPERT as
> remount of resctrl fs could potentially lead to out-of-range PARTIDs being
> used and the fix requires a change in fs/resctrl. The setting of MPAM2_EL2
> (for pKVM/nVHE) is dropped as too expensive a write for not much value.
> 
> There are a couple of 'fixes' at the start of the series which address
> problems in the base driver but are only user visible due to this series.
> 
> Changelogs in patches
> 
> Thanks for all the reviewing and testing so far. Just a bit more to get this
> over the line.
> 
> There is a small build conflict with the MPAM abmc precursors series [1], which
> alters some of the resctrl arch hooks. I will shortly be posting a respin
> of that too.
> 
> [1] https://lore.kernel.org/lkml/20260225201905.3568624-1-ben.horgan@arm.com/
> 
>  From James' cover letter:
> 
> This is the missing piece to make MPAM usable via resctrl in user-space. This has
> shed its debugfs code and the read/write 'event configuration' for the monitors
> to make the series smaller.
> 
> This adds the arch code and KVM support first. I anticipate the whole thing
> going via arm64, but if it goes via tip instead, then an immutable branch with those
> patches should be easy to do.
> 
> Generally the resctrl glue code works by picking what MPAM features it can expose
> from the MPAM driver, then configuring the structs that back the resctrl helpers.
> If your platform is sufficiently Xeon shaped, you should be able to get L2/L3 CPOR
> bitmaps exposed via resctrl. CSU counters work if they are on/after the L3. MBWU
> counters are considerably more hairy, and depend on heuristics around the topology,
> and a bunch of stuff trying to emulate ABMC.
> If it didn't pick what you wanted it to, please share the debug messages produced
> when enabling dynamic debug and booting with:
> | dyndbg="file mpam_resctrl.c +pl"
> 
> I've not found a platform that can test all the behaviours around the monitors,
> so this is where I'd expect the most bugs.
> 
> The MPAM spec that describes all the system and MMIO registers can be found here:
> https://developer.arm.com/documentation/ddi0598/db/?lang=en
> (Ignore the 'RETIRED' warning - that is just Arm moving the documentation around.
>   This document has the best overview.)
> 
> 
> Based on v7.0-rc3
> 
> The series can be retrieved from:
> https://gitlab.arm.com/linux-arm/linux-bh.git mpam_resctrl_glue_v6
> 

[...]

Retested this series on NVIDIA's Grace Hopper machine, where L3 cache partitioning
and MBW (soft) limiting worked as expected. In addition, the L3 cache monitor
counters increase as more cache usage is observed.
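
The occupancy counters mentioned above are read from the mon_data directory
of a resctrl group. A minimal Python sketch, assuming the usual resctrl
layout (mon_data/mon_L3_<id>/llc_occupancy) under /sys/fs/resctrl; the
function name is hypothetical and this is not the procedure used for the
test above:

```python
from pathlib import Path


def read_llc_occupancy(group: str = "", root: str = "/sys/fs/resctrl") -> dict:
    """Map each mon_L3_* domain directory to its llc_occupancy in bytes.

    An empty `group` reads the default (root) group. Non-numeric readings
    such as 'Unavailable' are reported as None rather than raising.
    """
    result = {}
    mon_data = Path(root) / group / "mon_data"
    for dom in sorted(mon_data.glob("mon_L3_*")):
        text = (dom / "llc_occupancy").read_text().strip()
        result[dom.name] = int(text) if text.isdigit() else None
    return result
```

Sampling this before and after starting a cache-heavy workload in the group
is one way to watch the counters grow.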

Tested-by: Gavin Shan <gshan@redhat.com>

Thanks,
Gavin
Re: [PATCH v6 00/40] arm_mpam: Add KVM/arm64 and resctrl glue code
Posted by Ben Horgan 1 week, 1 day ago
Hi Gavin,

On 3/23/26 04:41, Gavin Shan wrote:
> On 3/14/26 12:45 AM, Ben Horgan wrote:
>> This version of the mpam missing pieces series sees a couple of things
>> dropped or hidden. Memory bandwidth utilization with free-running counters
>> is dropped in favour of just always using 'mbm_event' mode (ABMC
>> emulation) which simplifies the code and allows for, in the future,
>> filtering by read/write traffic. So, for the interim, there is no memory
>> bandwidth utilization support. CDP is hidden behind CONFIG_EXPERT as
>> remount of resctrl fs could potentially lead to out-of-range PARTIDs being
>> used and the fix requires a change in fs/resctrl. The setting of MPAM2_EL2
>> (for pKVM/nVHE) is dropped as too expensive a write for not much value.
>>
>> There are a couple of 'fixes' at the start of the series which address
>> problems in the base driver but are only user visible due to this series.
>>
>> Changelogs in patches
>>
>> Thanks for all the reviewing and testing so far. Just a bit more to get this
>> over the line.
>>
>> There is a small build conflict with the MPAM abmc precursors series [1], which
>> alters some of the resctrl arch hooks. I will shortly be posting a respin
>> of that too.
>>
>> [1] https://lore.kernel.org/lkml/20260225201905.3568624-1-ben.horgan@arm.com/
>>
>>  From James' cover letter:
>>
>> This is the missing piece to make MPAM usable via resctrl in user-space. This has
>> shed its debugfs code and the read/write 'event configuration' for the monitors
>> to make the series smaller.
>>
>> This adds the arch code and KVM support first. I anticipate the whole thing
>> going via arm64, but if it goes via tip instead, then an immutable branch with those
>> patches should be easy to do.
>>
>> Generally the resctrl glue code works by picking what MPAM features it can expose
>> from the MPAM driver, then configuring the structs that back the resctrl helpers.
>> If your platform is sufficiently Xeon shaped, you should be able to get L2/L3 CPOR
>> bitmaps exposed via resctrl. CSU counters work if they are on/after the L3. MBWU
>> counters are considerably more hairy, and depend on heuristics around the topology,
>> and a bunch of stuff trying to emulate ABMC.
>> If it didn't pick what you wanted it to, please share the debug messages produced
>> when enabling dynamic debug and booting with:
>> | dyndbg="file mpam_resctrl.c +pl"
>>
>> I've not found a platform that can test all the behaviours around the monitors,
>> so this is where I'd expect the most bugs.
>>
>> The MPAM spec that describes all the system and MMIO registers can be found here:
>> https://developer.arm.com/documentation/ddi0598/db/?lang=en
>> (Ignore the 'RETIRED' warning - that is just Arm moving the documentation around.
>>   This document has the best overview.)
>>
>>
>> Based on v7.0-rc3
>>
>> The series can be retrieved from:
>> https://gitlab.arm.com/linux-arm/linux-bh.git mpam_resctrl_glue_v6
>>
> 
> [...]
> 
> Retested this series on NVIDIA's Grace Hopper machine, where L3 cache partitioning
> and MBW (soft) limiting worked as expected. In addition, the L3 cache monitor
> counters increase as more cache usage is observed.
> 
> Tested-by: Gavin Shan <gshan@redhat.com>

Thanks for testing and all the reviews.

Ben

> 
> Thanks,
> Gavin
> 
>