Hi,
This series adds support for AMD's Privilege-Level Zero Association
(PLZA) so kernel work can be assigned to a resctrl group, and wires it
up through a small generic "kernel mode" (kmode) layer in fs/resctrl
so future architectures can plug in without touching core resctrl.
The features are documented in:
AMD64 Zen6 Platform Quality of Service (PQOS) Extensions,
Publication # 69193 Revision 1.00, Issue Date March 2026
available at https://bugzilla.kernel.org/show_bug.cgi?id=206537
The patches are based on top of 7.1.0-rc1, commit
3382329a309d ("Merge branch into tip/master: 'timers/clocksource'").
Background
==========
Customers have identified an issue while using the QoS resource control
feature. If the memory bandwidth associated with a CLOSID is aggressively
throttled and a task running under that CLOSID enters kernel mode, the
kernel operations are throttled just as aggressively. This can stall
forward progress and eventually degrade overall system performance.
Privilege-Level Zero Association (PLZA) allows the user to specify a CLOSID
and/or RMID associated with execution in Privilege-Level Zero. When enabled
on a HW thread, when the thread enters Privilege-Level Zero, transactions
associated with that thread will be associated with the PLZA CLOSID and/or
RMID. Otherwise, the HW thread will be associated with the CLOSID and RMID
identified by PQR_ASSOC.
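The selection rule described above can be modelled in a few lines of
user-space C. This is only a sketch: struct assoc, struct plza_cfg and
effective_assoc() are hypothetical names, the separate closid_en flag is an
assumption drawn from the "CLOSID and/or RMID" wording, and nothing here
reflects the real MSR layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical CLOSID/RMID pair; not the real register format. */
struct assoc {
	uint32_t closid;
	uint32_t rmid;
};

/* Hypothetical per-hardware-thread PLZA settings. */
struct plza_cfg {
	bool plza_en;      /* PLZA enabled on this hardware thread */
	bool closid_en;    /* assumption: use the PLZA CLOSID at CPL0 */
	bool rmid_en;      /* use the PLZA RMID at CPL0 */
	struct assoc plza; /* values applied while at CPL0 */
};

/*
 * Return the CLOSID/RMID that would tag a transaction issued at
 * privilege level @cpl, starting from the PQR_ASSOC values.
 */
static struct assoc effective_assoc(const struct plza_cfg *cfg,
				    struct assoc pqr_assoc, unsigned int cpl)
{
	struct assoc out = pqr_assoc;

	if (cfg->plza_en && cpl == 0) {
		if (cfg->closid_en)
			out.closid = cfg->plza.closid;
		if (cfg->rmid_en)
			out.rmid = cfg->plza.rmid;
	}
	return out;
}
```

With closid_en set and rmid_en clear, CPL0 traffic in this model takes the
PLZA CLOSID but keeps the RMID from PQR_ASSOC; at any other privilege level
both values come from PQR_ASSOC.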
Design
======
A new sysfs file, info/kernel_mode, holds a single global policy that
selects what kernel work is steered and which rdtgroup it is steered
to. Reads describe the supported modes and the currently-active
binding; writes change the policy or rebind to a different group.
Look at the thread below for design discussion.
https://lore.kernel.org/lkml/14a8ad0a-e842-4268-871a-0762f1169e03@intel.com/
Per-rdtgroup files kmode_cpus and kmode_cpus_list scope the binding
to a subset of online CPUs without unbind/rebind churn. They are
visible only on the group that is currently the active kernel-mode
binding.
The arch hooks (resctrl_arch_get_kmode_support,
resctrl_arch_configure_kmode) keep the fs/resctrl layer arch-neutral.
Only AMD PLZA is wired up here; Intel and ARM can add their own
support later by implementing the hooks.
Layout
======
01-02 x86: PLZA CPU feature + MSR/data-structure plumbing.
03-05 fs/resctrl + x86: kmode data structures, arch hooks, and
population of supported modes.
06-08 fs/resctrl: global kmode config, info/kernel_mode read/write
and documentation.
09 fs/resctrl: reset the binding when the bound rdtgroup is
removed.
10-12 fs/resctrl: per-rdtgroup kmode_cpus[_list] - expose, gate
visibility on the bound group, and allow incremental writes.
Examples
========
(See Documentation/filesystems/resctrl.rst, "kernel_mode" and
"kmode_cpus" sections, for the full UAPI.)
# Mount resctrl
# mount -t resctrl resctrl /sys/fs/resctrl
# cd /sys/fs/resctrl
# Read the supported modes. The active mode is bracketed and reports
# the bound "<ctrl>/<mon>/" group; other supported modes report
# ":group=none" because nothing is bound to them.
# cat info/kernel_mode
[inherit_ctrl_and_mon:group=//]
global_assign_ctrl_inherit_mon_per_cpu:group=none
global_assign_ctrl_assign_mon_per_cpu:group=none
# Create a CTRL_MON group plus a MON child and bind both the kernel
# CLOSID and RMID to them.
# mkdir ctrl1
# mkdir ctrl1/mon_groups/mon1
# echo "global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/" \
> info/kernel_mode
# cat info/kernel_mode
inherit_ctrl_and_mon:group=none
global_assign_ctrl_inherit_mon_per_cpu:group=none
[global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/]
# kmode_cpus and kmode_cpus_list are visible only on the bound group.
# ls ctrl1/kmode_cpus*
ctrl1/kmode_cpus ctrl1/kmode_cpus_list
# Restrict the binding to a CPU subset; the write is incremental.
# echo 0-3 > ctrl1/kmode_cpus_list
# cat ctrl1/kmode_cpus
f
# cat ctrl1/kmode_cpus_list
0-3
# Empty masks are rejected; use info/kernel_mode to reset to
# "every online CPU".
# echo "" > ctrl1/kmode_cpus_list
bash: echo: write error: Invalid argument
# cat info/last_cmd_status
Empty mask not allowed; use info/kernel_mode to unbind
# Disable kernel-mode steering (back to inherit, default group).
# echo "inherit_ctrl_and_mon" > info/kernel_mode
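For tooling that wants to consume the output shown in the session above, a
minimal user-space sketch of parsing one line of info/kernel_mode might look
like the following. It assumes the v3 "mode:group=<group>" format with
brackets around the active mode; struct kmode_line and parse_kmode_line()
are hypothetical names, and the format itself is still under review.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical result of parsing one info/kernel_mode line. */
struct kmode_line {
	bool active;      /* line was wrapped in [ ] */
	char mode[64];    /* e.g. "inherit_ctrl_and_mon" */
	char group[128];  /* e.g. "ctrl1/mon1/" or "none" */
};

/* Parse one line; return 0 on success, -1 on malformed input. */
static int parse_kmode_line(const char *line, struct kmode_line *out)
{
	char buf[256];
	size_t len = strcspn(line, "\r\n"); /* ignore a trailing newline */
	char *p = buf;
	char *sep;

	if (len == 0 || len >= sizeof(buf))
		return -1;
	memcpy(buf, line, len);
	buf[len] = '\0';

	out->active = false;
	if (buf[0] == '[') {
		/* An opening bracket must be matched by a closing one. */
		if (buf[len - 1] != ']')
			return -1;
		buf[len - 1] = '\0';
		p = buf + 1;
		out->active = true;
	}

	sep = strstr(p, ":group=");
	if (!sep)
		return -1;
	*sep = '\0';
	sep += strlen(":group=");

	if (p[0] == '\0' || strlen(p) >= sizeof(out->mode) ||
	    strlen(sep) >= sizeof(out->group))
		return -1;
	strcpy(out->mode, p);
	strcpy(out->group, sep);
	return 0;
}
```

Keeping the group as an opaque string avoids baking in the "none" spelling,
which may change in a later revision.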
Tested on AMD with PLZA; the generic bits build clean on x86 without
PLZA support and are no-ops at runtime.
Changelog
=========
v3:
- Generalise the layer beyond AMD: rename "PLZA mode" to "kernel
mode" (kmode) in code, sysfs, and Documentation. The public
interface is now info/kernel_mode and per-group kmode_cpus[_list].
- info/kernel_mode UAPI cleanups: ":group=none" instead of
":group=uninitialized"; designated initialisers + static_assert
for the mode-name table; strim() the input; clearer error
messages via last_cmd_status.
- kmode_cpus / kmode_cpus_list:
* 0010 exposes them read-only on every group.
* 0011 toggles their visibility via kernfs_show() so they
appear only on the rdtgroup currently bound to the active
kernel mode.
* 0012 (new) makes them writable: incremental
enable/disable deltas via resctrl_arch_configure_kmode(),
empty masks rejected with -EINVAL ("use info/kernel_mode
to unbind"), offline CPUs rejected, defensive -EBUSY for
stale fds opened before an info/kernel_mode rebind.
- 0009: reset the binding when the bound rdtgroup is removed,
instead of leaving stale state.
- Kerneldoc/comment cleanups across the series; Documentation
updated alongside the UAPI changes.
v2:
This is similar to the RFC, with a new proposal. The names of some
interfaces are not final. Let's fix that later as we move forward.
Separated the two features: Global Bandwidth Enforcement (GLBE) and
Privilege Level Zero Association (PLZA).
This series only adds support for PLZA.
Used kmode instead of PLZA as the name of the feature. That can be changed as well.
Tony suggested using global variables to store the kernel mode
CLOSID and RMID. However, with the new interface the kernel mode CLOSID
and RMID come from the rdtgroup structure. Accessing them requires
holding the associated lock, which would make the context switch path
unnecessarily expensive. So, dropped the idea.
https://lore.kernel.org/lkml/aXuxVSbk1GR2ttzF@agluck-desk3/
Let me know if there are other ways to optimize this.
Patch 1: Data structures and arch hook: Add resctrl_kmode,
resctrl_kmode_cfg, kernel-mode bits, and resctrl_arch_get_kmode_cfg()
for generic resctrl kernel mode (e.g. PLZA).
Patch 2: Implement resctrl_arch_get_kmode_cfg() on x86, add global resctrl_kcfg
and resctrl_kmode_init() to set default kmode.
Patch 3: Add info/kernel_mode and resctrl_kernel_mode_show() to list supported
kernel modes and show the current one in brackets.
Patch 4: Add x86 PLZA support and boot option rdt=plza.
Patch 5: Add supported modes from CPUID.
Patch 6: Add rdt_kmode_enable_key and arch enable/disable helpers so PLZA only
touches fast paths when enabled.
Patch 7: Add MSR_IA32_PQR_PLZA_ASSOC, bit defines, and union qos_pqr_plza_assoc
for programming PLZA.
Patch 8: Add Per-CPU and per-task state.
Patch 9: Add resctrl_arch_configure_kmode() and resctrl_arch_set_kmode()
to program PLZA per domain and set/clear it on a CPU.
Patch 10: In the sched-in path, program MSR_IA32_PQR_PLZA_ASSOC from task or
per-CPU kmode; only write when kmode changes; guard with rdt_kmode_enable_key.
Patch 11: Add write handler so the current kernel mode can be set by name.
Patch 12: Add info/kernel_mode_assignment and show which rdtgroup is assigned
for kernel mode in CTRL_MON/MON/ form.
Patch 13: Add write handler to assign/clear the group used for kernel mode;
enforce single assignment and clear on rmdir.
Patch 14: Update per-CPU PLZA state when its cpu_mask changes (add/remove CPUs)
via cpus_write_kmode() and helpers.
Patch 15: Refactor so task list respects t->kmode when the group has kmode (PLZA),
so tasks are shown correctly.
v2: https://lore.kernel.org/lkml/cover.1773347820.git.babu.moger@amd.com/
v1: https://lore.kernel.org/lkml/cover.1769029977.git.babu.moger@amd.com/
Babu Moger (12):
x86/resctrl: Support Privilege-Level Zero Association (PLZA)
x86/resctrl: Add data structures and definitions for PLZA
configuration
fs/resctrl: Add kernel mode (kmode) data structures and arch hook
x86,fs/resctrl: Program PLZA through kmode arch hooks
x86/resctrl: Initialize supported kernel modes for PLZA
fs/resctrl: Initialize the global kernel-mode policy at subsystem init
fs/resctrl: Add info/kernel_mode for kernel-mode policy introspection
fs/resctrl: Make info/kernel_mode writable and identify the bound
group
fs/resctrl: Reset kernel-mode binding when its rdtgroup goes away
fs/resctrl: Expose kmode_cpus / kmode_cpus_list per rdtgroup
resctrl: Hide kmode_cpus[_list] on groups not bound to kernel-mode
fs/resctrl: Allow user space to write kmode_cpus / kmode_cpus_list
.../admin-guide/kernel-parameters.txt | 2 +-
Documentation/filesystems/resctrl.rst | 84 ++
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/msr-index.h | 7 +
arch/x86/kernel/cpu/resctrl/core.c | 17 +
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 35 +
arch/x86/kernel/cpu/resctrl/internal.h | 27 +
arch/x86/kernel/cpu/scattered.c | 1 +
fs/resctrl/internal.h | 6 +
fs/resctrl/rdtgroup.c | 784 ++++++++++++++++++
include/linux/resctrl.h | 23 +
include/linux/resctrl_types.h | 46 +
12 files changed, 1032 insertions(+), 1 deletion(-)
--
2.43.0
Hi Babu,

On 4/30/26 4:24 PM, Babu Moger wrote:

> Design
> ======
>
> A new sysfs file, info/kernel_mode, holds a single global policy that
> selects what kernel work is steered and which rdtgroup it is steered

How should "selects *what* kernel work is steered" be interpreted? Do these
modes not all apply to *all* kernel work?

> to. Reads describe the supported modes and the currently-active
> binding; writes change the policy or rebind to a different group.
> Look at the thread below for design discussion.
> https://lore.kernel.org/lkml/14a8ad0a-e842-4268-871a-0762f1169e03@intel.com/
>

...

> Examples
> ========
>
> (See Documentation/filesystems/resctrl.rst, "kernel_mode" and
> "kmode_cpus" sections, for the full UAPI.)
>
> # Mount resctrl
> # mount -t resctrl resctrl /sys/fs/resctrl
> # cd /sys/fs/resctrl
>
> # Read the supported modes. The active mode is bracketed and reports
> # the bound "<ctrl>/<mon>/" group; other supported modes report
> # ":group=none" because nothing is bound to them.
> # cat info/kernel_mode
> [inherit_ctrl_and_mon:group=//]

This is unexpected since associating a group to this mode implies that this
group is used to manage allocations and monitoring of kernel work but this
is not true, right? From what I understand there should be no group associated with
this default "inherit_ctrl_and_mon" mode.

> global_assign_ctrl_inherit_mon_per_cpu:group=none
> global_assign_ctrl_assign_mon_per_cpu:group=none

nit: "none" does not reflect state as clearly as "unset"/"uninitialized"/"NA"

>
> # Create a CTRL_MON group plus a MON child and bind both the kernel
> # CLOSID and RMID to them.
> # mkdir ctrl1
> # mkdir ctrl1/mon_groups/mon1
> # echo "global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/" \
>       > info/kernel_mode
> # cat info/kernel_mode
> inherit_ctrl_and_mon:group=none
> global_assign_ctrl_inherit_mon_per_cpu:group=none
> [global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/]
>
> # kmode_cpus and kmode_cpus_list are visible only on the bound group.
> # ls ctrl1/kmode_cpus*
> ctrl1/kmode_cpus ctrl1/kmode_cpus_list

Since it is ctrl1/mon1 that was bound, should these CPU files not appear
in ctrl1/mon_groups/mon1 ?

>
> # Restrict the binding to a CPU subset; the write is incremental.

Does "incremental" mean that if the file contains CPUs 0-3 then writing
"4" would set the CPUs to 0-4? This does not sound right since it is
expected that user space can remove CPUs also?

> # echo 0-3 > ctrl1/kmode_cpus_list
> # cat ctrl1/kmode_cpus
> f
> # cat ctrl1/kmode_cpus_list
> 0-3
>
> # Empty masks are rejected; use info/kernel_mode to reset to
> # "every online CPU".
> # echo "" > ctrl1/kmode_cpus_list
> bash: echo: write error: Invalid argument
> # cat info/last_cmd_status
> Empty mask not allowed; use info/kernel_mode to unbind

Why are empty masks rejected/not allowed?

>
> # Disable kernel-mode steering (back to inherit, default group).

This sounds like kernel work is steered to default group which I do not
think is accurate for the "inherit_ctrl_and_mon" mode.

> # echo "inherit_ctrl_and_mon" > info/kernel_mode
>
> Tested on AMD with PLZA; the generic bits build clean on x86 without
> PLZA support and are no-ops at runtime.

Reinette
On 6/11/2026 4:53 PM, Reinette Chatre wrote:
> Hi Babu,
>
> On 4/30/26 4:24 PM, Babu Moger wrote:
>> Design
>> ======
>>
>> A new sysfs file, info/kernel_mode, holds a single global policy that
>> selects what kernel work is steered and which rdtgroup it is steered
>
> How should "selects *what* kernel work is steered" be interpreted? Do these
> modes not all apply to *all* kernel work?

How about?

A new sysfs file, info/kernel_mode, holds a single global policy for
kernel contexts and the rdtgroup associated with the policy.

>
>> to. Reads describe the supported modes and the currently-active
>> binding; writes change the policy or rebind to a different group.
>> Look at the thread below for design discussion.
>> https://lore.kernel.org/lkml/14a8ad0a-e842-4268-871a-0762f1169e03@intel.com/
>>
>
> ...
>
>> Examples
>> ========
>>
>> (See Documentation/filesystems/resctrl.rst, "kernel_mode" and
>> "kmode_cpus" sections, for the full UAPI.)
>>
>> # Mount resctrl
>> # mount -t resctrl resctrl /sys/fs/resctrl
>> # cd /sys/fs/resctrl
>>
>> # Read the supported modes. The active mode is bracketed and reports
>> # the bound "<ctrl>/<mon>/" group; other supported modes report
>> # ":group=none" because nothing is bound to them.
>> # cat info/kernel_mode
>> [inherit_ctrl_and_mon:group=//]
>
> This is unexpected since associating a group to this mode implies that this
> group is used to manage allocations and monitoring of kernel work but this
> is not true, right? From what I understand there should be no group associated with
> this default "inherit_ctrl_and_mon" mode.

The default mode is "inherit_ctrl_and_mon", where both user mode and
kernel mode share the same CLOSID and RMID. This is the current mode
(without this series). I thought we were going to set the default mode
with the default group when the system boots up. No?

>
>> global_assign_ctrl_inherit_mon_per_cpu:group=none
>> global_assign_ctrl_assign_mon_per_cpu:group=none
>
> nit: "none" does not reflect state as clearly as "unset"/"uninitialized"/"NA"

Let's go with "uninitialized".

>
>>
>> # Create a CTRL_MON group plus a MON child and bind both the kernel
>> # CLOSID and RMID to them.
>> # mkdir ctrl1
>> # mkdir ctrl1/mon_groups/mon1
>> # echo "global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/" \
>>       > info/kernel_mode
>> # cat info/kernel_mode
>> inherit_ctrl_and_mon:group=none
>> global_assign_ctrl_inherit_mon_per_cpu:group=none
>> [global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/]
>>
>> # kmode_cpus and kmode_cpus_list are visible only on the bound group.
>> # ls ctrl1/kmode_cpus*
>> ctrl1/kmode_cpus ctrl1/kmode_cpus_list
>
> Since it is ctrl1/mon1 that was bound, should these CPU files not appear
> in ctrl1/mon_groups/mon1 ?

Correct. Will fix it.

>>
>> # Restrict the binding to a CPU subset; the write is incremental.
>
> Does "incremental" mean that if the file contains CPUs 0-3 then writing
> "4" would set the CPUs to 0-4? This does not sound right since it is
> expected that user space can remove CPUs also?

Will remove incremental. Writing "4" will remove 0-3 and keep only 4.

>
>> # echo 0-3 > ctrl1/kmode_cpus_list
>> # cat ctrl1/kmode_cpus
>> f
>> # cat ctrl1/kmode_cpus_list
>> 0-3
>>
>> # Empty masks are rejected; use info/kernel_mode to reset to
>> # "every online CPU".
>> # echo "" > ctrl1/kmode_cpus_list
>> bash: echo: write error: Invalid argument
>> # cat info/last_cmd_status
>> Empty mask not allowed; use info/kernel_mode to unbind
>
> Why are empty masks rejected/not allowed?

No specific reason. When the mode is switched, we discussed earlier to
globally apply the mode to all the online CPUs. At this point reading
"kmode_cpus_list" will still report empty. Users can change it to
selectively apply the mode by writing to "kmode_cpus_list".

I was not sure what the action should be when an empty mask is written.
Should the empty mask apply the mode to all the online CPUs?

>
>>
>> # Disable kernel-mode steering (back to inherit, default group).
>
> This sounds like kernel work is steered to default group which I
> do not think is accurate for the "inherit_ctrl_and_mon" mode.

How about?

Drop the kernel-mode binding and restore inherit_ctrl_and_mon on the
default group.

thanks
Babu
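The replace semantics agreed in this reply (writing "4" over "0-3" leaves
only CPU 4) can still be programmed as the enable/disable deltas mentioned
in the v3 changelog. A rough user-space sketch follows, with CPUs modelled
as bits in a 64-bit word; struct kmode_delta and kmode_mask_delta() are
hypothetical names, not code from the series.

```c
#include <stdint.h>

/* CPUs are modelled as bits in one 64-bit word; a real cpumask is wider. */
struct kmode_delta {
	uint64_t enable;  /* CPUs newly in scope: program with enable=true */
	uint64_t disable; /* CPUs that left the scope: program with enable=false */
};

/* Compute what has to change when @old_mask is replaced by @new_mask. */
static struct kmode_delta kmode_mask_delta(uint64_t old_mask, uint64_t new_mask)
{
	struct kmode_delta d = {
		.enable  = new_mask & ~old_mask,
		.disable = old_mask & ~new_mask,
	};
	return d;
}
```

CPUs present in both masks appear in neither set, so they would not be
reprogrammed by a caller that acts only on the two deltas.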
Hi Babu,
While reviewing this series I noticed two cross-architecture issues.
1) aarch64 allyesconfig link failure
resctrl_arch_configure_kmode() and resctrl_arch_get_kmode_support()
are declared in include/linux/resctrl.h:
void resctrl_arch_get_kmode_support(struct resctrl_kmode_cfg *kcfg);
void resctrl_arch_configure_kmode(const struct cpumask *cpu_mask,
u32 closid, u32 rmid, bool enable);
but only implemented under arch/x86/. ARM MPAM selects
CONFIG_RESCTRL_FS, so fs/resctrl/rdtgroup.c is compiled on aarch64
and the linker fails:
ld: fs/resctrl/rdtgroup.o: in function `rdtgroup_config_kmode_clear':
rdtgroup.c: undefined reference to `resctrl_arch_configure_kmode'
ld: fs/resctrl/rdtgroup.o: in function `resctrl_init':
rdtgroup.c: undefined reference to `resctrl_arch_get_kmode_support'
Other arch-specific functions already have empty stubs in
drivers/resctrl/mpam_resctrl.c, e.g.:
int resctrl_arch_io_alloc_enable(struct rdt_resource *r, bool enable)
{
return -EOPNOTSUPP;
}
Adding the same for the two kmode functions would fix the build.
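A sketch of what such stubs might look like, using the two prototypes
quoted above. To keep it compilable outside the kernel tree, u32 is a local
typedef and both structures are left opaque; whether an empty
resctrl_arch_get_kmode_support() is enough depends on fs/resctrl setting up
the default mode itself, which is an assumption here.

```c
#include <stdbool.h>

typedef unsigned int u32;   /* stand-in for the kernel's u32 */
struct resctrl_kmode_cfg;   /* opaque here; defined by the series */
struct cpumask;             /* opaque here */

/*
 * Advertise no additional modes. Assumes the caller has already set up
 * the default inherit_ctrl_and_mon mode before calling the hook.
 */
void resctrl_arch_get_kmode_support(struct resctrl_kmode_cfg *kcfg)
{
	(void)kcfg;
}

/* Nothing to program: no kernel-mode association hardware is modelled. */
void resctrl_arch_configure_kmode(const struct cpumask *cpu_mask,
				  u32 closid, u32 rmid, bool enable)
{
	(void)cpu_mask;
	(void)closid;
	(void)rmid;
	(void)enable;
}
```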
2) info/kernel_mode visible on non-PLZA platforms
The "kernel_mode" entry is registered with fflags = RFTYPE_TOP_INFO:
{
.name = "kernel_mode",
.mode = 0644,
...
.fflags = RFTYPE_TOP_INFO,
},
This makes the file appear unconditionally under info/ -- even on
platforms without PLZA (ARM MPAM, older AMD/Intel without
X86_FEATURE_PLZA). On those platforms the file only shows the
default inherit_ctrl_and_mon mode, which is confusing since there
are no other modes to switch to.
The io_alloc file handles this by starting with fflags = 0 and
enabling it conditionally:
/* io_alloc: fflags defaults to 0 (hidden) */
static void io_alloc_init(void)
{
if (r->cache.io_alloc_capable) {
resctrl_file_fflags_init("io_alloc", ...);
}
}
The same pattern could work for kernel_mode: set fflags = 0 and
call resctrl_file_fflags_init("kernel_mode", RFTYPE_TOP_INFO)
in resctrl_kmode_init() only when resctrl_arch_get_kmode_support()
registers modes beyond INHERIT_CTRL_AND_MON.
Thanks,
Qinyun
Hi Qinyun,
On 6/8/26 04:23, Qinyun Tan wrote:
> Hi Babu,
>
> While reviewing this series I noticed two cross-architecture issues.
>
> 1) aarch64 allyesconfig link failure
>
> resctrl_arch_configure_kmode() and resctrl_arch_get_kmode_support()
> are declared in include/linux/resctrl.h:
>
> void resctrl_arch_get_kmode_support(struct resctrl_kmode_cfg *kcfg);
> void resctrl_arch_configure_kmode(const struct cpumask *cpu_mask,
> u32 closid, u32 rmid, bool enable);
>
> but only implemented under arch/x86/. ARM MPAM selects
> CONFIG_RESCTRL_FS, so fs/resctrl/rdtgroup.c is compiled on aarch64
> and the linker fails:
>
> ld: fs/resctrl/rdtgroup.o: in function `rdtgroup_config_kmode_clear':
> rdtgroup.c: undefined reference to `resctrl_arch_configure_kmode'
> ld: fs/resctrl/rdtgroup.o: in function `resctrl_init':
> rdtgroup.c: undefined reference to `resctrl_arch_get_kmode_support'
>
> Other arch-specific functions already have empty stubs in
> drivers/resctrl/mpam_resctrl.c, e.g.:
>
> int resctrl_arch_io_alloc_enable(struct rdt_resource *r, bool enable)
> {
> return -EOPNOTSUPP;
> }
>
> Adding the same for the two kmode functions would fix the build.
Good catch. Will add in empty stubs in next revision for these 3 functions.
>
> 2) info/kernel_mode visible on non-PLZA platforms
>
> The "kernel_mode" entry is registered with fflags = RFTYPE_TOP_INFO:
>
> {
> .name = "kernel_mode",
> .mode = 0644,
> ...
> .fflags = RFTYPE_TOP_INFO,
> },
>
> This makes the file appear unconditionally under info/ -- even on
> platforms without PLZA (ARM MPAM, older AMD/Intel without
> X86_FEATURE_PLZA). On those platforms the file only shows the
> default inherit_ctrl_and_mon mode, which is confusing since there
> are no other modes to switch to.
This is expected. We intended to expose the info/kernel_mode file for
all architectures, and it will display the default mode
inherit_ctrl_and_mon.
This approach was also suggested by Reinette.
https://lore.kernel.org/lkml/2ab556af-095b-422b-9396-f845c6fd0342@intel.com/
Thanks
Babu
On 6/9/26 10:10 PM, Babu Moger wrote:
> Good catch. Will add in empty stubs in next revision for these 3 functions.

Thanks.

> This is expected. We intended to expose the info/kernel_mode file for
> all architectures, and it will display the default mode
> inherit_ctrl_and_mon.
>
> This approach was also suggested by Reinette.
> https://lore.kernel.org/lkml/2ab556af-095b-422b-9396-f845c6fd0342@intel.com/

Got it, makes sense. Thanks for the pointer to Reinette's suggestion.

Qinyun
Hi Babu,
While reviewing this v3 series I found a few issues in the kernel-mode
(PLZA) support and wrote a fix for each.
I'm sending them as a small follow-up set on top of v3 so they are easy
to fold into the next revision, or to take as separate patches --
whichever you prefer. The patches are ordered by dependency (build fix
-> semantic fix -> the two binding fixes) so the series is bisectable on
top of v3.
Patch 1 (ARM MPAM build fix): fs/resctrl now calls
resctrl_arch_get_kmode_support()/resctrl_arch_configure_kmode(), which
are only implemented on x86, so an aarch64 allyesconfig (MPAM) fails to
link. Add empty arch stubs, and hide info/kernel_mode on platforms that
advertise no mode beyond inherit_ctrl_and_mon.
Patch 2 (RMID_EN + RDTMON_GROUP): RMID_EN is hardcoded to 1, so
inherit_mon counts kernel-mode traffic under the PLZA RMID instead of
inheriting from PQR_ASSOC; and assign_mon is forced to bind an
RDTMON_GROUP, wasting an RMID. Make RMID_EN mode-based and let assign_mon
also accept a control group. This is the issue we discussed earlier and
you confirmed; this is the patch for it.
Patch 3 (atomic switch): resctrl_kernel_mode_write() releases the
previous binding before it programs the new one. If programming the new
binding fails (-ENOMEM, or a pseudo-locked target group), the old,
working binding is already gone -- a user who only tried to switch loses
the original configuration too. Make the switch atomic: all fallible
work is done before the old binding is released, so a failed switch is a
no-op.
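The ordering described for patch 3 is the usual prepare-then-commit
pattern. A small user-space model is sketched below, assuming the fallible
work can be separated from the final state update; struct binding,
prepare_binding() and switch_binding() are hypothetical names, and -EINVAL
merely stands in for the real failure cases.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical description of a kernel-mode binding. */
struct binding {
	int mode;
	int group_id;
};

struct kmode_state {
	struct binding cur;
	bool bound;
};

/* Hypothetical fallible step: validate and prepare the new binding. */
static int prepare_binding(const struct binding *nb)
{
	if (nb->group_id < 0)
		return -EINVAL; /* stands in for -ENOMEM, pseudo-locked group, ... */
	return 0;
}

/* Switch to @nb; on failure @st is left exactly as it was. */
static int switch_binding(struct kmode_state *st, const struct binding *nb)
{
	int ret = prepare_binding(nb); /* all fallible work happens first */

	if (ret)
		return ret;            /* old binding is still in place */

	st->cur = *nb;                 /* commit step: cannot fail */
	st->bound = true;
	return 0;
}
```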
Patch 4 (CPU online): the PLZA MSR is per-CPU and is only written over
the CPUs that are online at bind time / mask change; nothing reprograms a
CPU that comes online afterwards. A hot-added vCPU, or a CPU that was
offline at bind time, then runs with PLZA off although it is in scope,
while info/kernel_mode still reports the binding as active. Drive the
per-CPU state from resctrl_online_cpu() so it is synced idempotently on
every online (and stale enable bits are cleared for a CPU that left the
scope while offline).
Concretely, the patch 4 failure mode is: offline a CPU, bind a
global-assign mode while it is absent, then online it -- the onlined CPU
is left with PLZA_EN=0 although it is in scope, while a CPU that was
present at bind time has PLZA_EN=1, so its CPL0 traffic is not accounted
to the bound kernel-mode group.
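One way to picture the idempotent sync is to derive each CPU's enable state
purely from the current binding every time the CPU comes online. The model
below is a user-space sketch only: CPUs are bits in a 64-bit scope word, the
plza_en array stands in for the per-CPU MSR bit, and all names are
hypothetical rather than taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical view of the global binding. */
struct kmode_binding {
	bool active;    /* a global-assign mode is currently bound */
	uint64_t scope; /* CPUs the binding applies to, one bit per CPU */
};

/* Desired enable state for @cpu, derived only from the binding. */
static bool kmode_cpu_should_enable(const struct kmode_binding *b,
				    unsigned int cpu)
{
	return b->active && cpu < 64 && (b->scope & (UINT64_C(1) << cpu));
}

/*
 * Model of the CPU-online path: overwrite the per-CPU state with the
 * desired value. Calling it repeatedly gives the same result, and a
 * stale enable bit is cleared when the CPU is no longer in scope.
 * The caller must pass a @plza_en array with more than @cpu entries.
 */
static void kmode_sync_online_cpu(const struct kmode_binding *b,
				  unsigned int cpu, bool *plza_en)
{
	plza_en[cpu] = kmode_cpu_should_enable(b, cpu);
}
```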
I'd appreciate your view on whether these match your intent for the
design.
Qinyun Tan (4):
resctrl: Add kmode arch stubs for ARM MPAM and hide kernel_mode on
non-PLZA platforms
resctrl: Fix PLZA RMID_EN to be mode-based and relax RDTMON_GROUP
constraint for assign_mon
fs/resctrl: make a failed kernel-mode switch a no-op
fs/resctrl: program PLZA on a CPU that comes online under a binding
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 9 +-
drivers/resctrl/mpam_resctrl.c | 9 +
fs/resctrl/rdtgroup.c | 235 ++++++++++++++--------
include/linux/resctrl.h | 8 +-
4 files changed, 171 insertions(+), 90 deletions(-)
--
2.43.7
Hi Qinyun,

On 6/11/26 06:17, Qinyun Tan wrote:
> Hi Babu,
>
> While reviewing this v3 series I found a few issues in the kernel-mode
> (PLZA) support and wrote a fix for each.
> I'm sending them as a small follow-up set on top of v3 so they are easy
> to fold into the next revision, or to take as separate patches --
> whichever you prefer. The patches are ordered by dependency (build fix
> -> semantic fix -> the two binding fixes) so the series is bisectable on
> top of v3.
>
> Patch 1 (ARM MPAM build fix): fs/resctrl now calls
> resctrl_arch_get_kmode_support()/resctrl_arch_configure_kmode(), which
> are only implemented on x86, so an aarch64 allyesconfig (MPAM) fails to
> link. Add empty arch stubs, and hide info/kernel_mode on platforms that
> advertise no mode beyond inherit_ctrl_and_mon.
>
> Patch 2 (RMID_EN + RDTMON_GROUP): RMID_EN is hardcoded to 1, so
> inherit_mon counts kernel-mode traffic under the PLZA RMID instead of
> inheriting from PQR_ASSOC; and assign_mon is forced to bind an
> RDTMON_GROUP, wasting an RMID. Make RMID_EN mode-based and let assign_mon
> also accept a control group. This is the issue we discussed earlier and
> you confirmed; this is the patch for it.
>
> Patch 3 (atomic switch): resctrl_kernel_mode_write() releases the
> previous binding before it programs the new one. If programming the new
> binding fails (-ENOMEM, or a pseudo-locked target group), the old,
> working binding is already gone -- a user who only tried to switch loses
> the original configuration too. Make the switch atomic: all fallible
> work is done before the old binding is released, so a failed switch is a
> no-op.
>
> Patch 4 (CPU online): the PLZA MSR is per-CPU and is only written over
> the CPUs that are online at bind time / mask change; nothing reprograms a
> CPU that comes online afterwards. A hot-added vCPU, or a CPU that was
> offline at bind time, then runs with PLZA off although it is in scope,
> while info/kernel_mode still reports the binding as active. Drive the
> per-CPU state from resctrl_online_cpu() so it is synced idempotently on
> every online (and stale enable bits are cleared for a CPU that left the
> scope while offline).
>
> Concretely, the patch 4 failure mode is: offline a CPU, bind a
> global-assign mode while it is absent, then online it -- the onlined CPU
> is left with PLZA_EN=0 although it is in scope, while a CPU that was
> present at bind time has PLZA_EN=1, so its CPL0 traffic is not accounted
> to the bound kernel-mode group.
>
> I'd appreciate your view on whether these match your intent for the
> design.
>
> Qinyun Tan (4):
>   resctrl: Add kmode arch stubs for ARM MPAM and hide kernel_mode on
>     non-PLZA platforms
>   resctrl: Fix PLZA RMID_EN to be mode-based and relax RDTMON_GROUP
>     constraint for assign_mon
>   fs/resctrl: make a failed kernel-mode switch a no-op
>   fs/resctrl: program PLZA on a CPU that comes online under a binding

I have gone through all your patches. Patches look good to me. I also ran
some basic tests to make sure it works as expected.

Note that I am still waiting for comments from the maintainer (Reinette).
I will fold your changes in v4 whenever that happens.

Thanks for the review and changes.

Thanks
Babu