[PATCH v2 00/16] fs,x86/resctrl: Add kernel-mode (e.g., PLZA) support to the resctrl subsystem

Babu Moger posted 16 patches 3 weeks, 4 days ago
[PATCH v2 00/16] fs,x86/resctrl: Add kernel-mode (e.g., PLZA) support to the resctrl subsystem
Posted by Babu Moger 3 weeks, 4 days ago
This series adds support for Privilege-Level Zero Association (PLZA) to the
resctrl subsystem. PLZA is an AMD feature that allows specifying a CLOSID
and/or RMID for execution in kernel mode (privilege level zero), so that
kernel work is not subject to the same resource constraints as the current
user-space task. This avoids kernel operations being aggressively throttled
when a task's memory bandwidth is heavily limited.

The feature documentation is not yet publicly available, but it is expected
to be released in the next few weeks. In the meantime, a brief description
of the features is provided below. 

Privilege Level Zero Association (PLZA) 

Privilege Level Zero Association (PLZA) allows the hardware to
automatically associate execution in Privilege Level Zero (CPL=0) with a
specific COS (Class of Service) and/or RMID (Resource Monitoring
Identifier). The QoS feature set already has a mechanism to associate
execution on each logical processor with an RMID or COS. PLZA allows the
system to override this per-thread association for a thread that is
executing with CPL=0. 
------------------------------------------------------------------------

The series introduces the feature in a way that supports the interface in
a generic manner to accommodate MPAM or other vendor-specific implementations.

Below are the detailed requirements provided by Reinette:
https://lore.kernel.org/lkml/2ab556af-095b-422b-9396-f845c6fd0342@intel.com/

Summary:
1. Kernel-mode/PLZA controls and status should be exposed under the resctrl
   info directory (/sys/fs/resctrl/info/), not as a separate or arch-specific path.

2. Add two info files

 a. kernel_mode
    Purpose: Control how resource allocation and monitoring apply in kernel mode
    (e.g. inherit from task vs global assign).

    Read: List supported modes and show current one (e.g. with [brackets]).
    Write: Set current mode by name (e.g. inherit_ctrl_and_mon, global_assign_ctrl_assign_mon).

 b. kernel_mode_assignment
    Purpose: When a “global assign” kernel mode is active, specify which resctrl group
    (CLOSID/RMID) is used for kernel work.

    Read: Show the assigned group in a path-like form (e.g. //, ctrl1//, ctrl1/mon1/).
    Write: Assign the group used for kernel mode; an empty write clears the assignment.

The patches are based on top of commit (v7.0.0-rc3)
839e91ce3f41b (tip/master) Merge branch into tip/master: 'x86/tdx'
------------------------------------------------------------------------

Examples: kernel_mode and kernel_mode_assignment

All paths below are under /sys/fs/resctrl/ (e.g. info/kernel_mode means
/sys/fs/resctrl/info/kernel_mode). Resctrl must be mounted and the platform
must support the relevant modes (e.g. AMD with PLZA).

1) kernel_mode — show and set the current kernel mode

   Read supported modes and which one is active (current in brackets):

     $ cat info/kernel_mode
     [inherit_ctrl_and_mon]
     global_assign_ctrl_inherit_mon
     global_assign_ctrl_assign_mon

   Set the active mode (e.g. use one CLOSID+RMID for all kernel work):

     $ echo "global_assign_ctrl_assign_mon" > info/kernel_mode
     $ cat info/kernel_mode
     inherit_ctrl_and_mon
     global_assign_ctrl_inherit_mon
     [global_assign_ctrl_assign_mon]

   Mode meanings:
   - inherit_ctrl_and_mon: kernel uses same CLOSID/RMID as the current task (default).
   - global_assign_ctrl_inherit_mon: one CLOSID for all kernel work; RMID inherited from user.
   - global_assign_ctrl_assign_mon: one resource group (CLOSID+RMID) for all kernel work.
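   For illustration only (not part of the series): a minimal user-space
   Python sketch of how a script might read the bracketed format shown
   above. The function name parse_kernel_mode() is hypothetical, and the
   sample text is copied from the example output rather than read from a
   real system.

```python
def parse_kernel_mode(text):
    """Return (supported_modes, active_mode) parsed from info/kernel_mode text."""
    supported = []
    active = None
    for line in text.splitlines():
        name = line.strip()
        if not name:
            continue
        # The active mode is the entry wrapped in square brackets.
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        supported.append(name)
    return supported, active


# Sample contents copied from the example output above.
sample = (
    "[inherit_ctrl_and_mon]\n"
    "global_assign_ctrl_inherit_mon\n"
    "global_assign_ctrl_assign_mon\n"
)
modes, current = parse_kernel_mode(sample)
print(current)
print(modes)
```

   On a real system the text would come from reading
   /sys/fs/resctrl/info/kernel_mode instead of the inline sample.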

2) kernel_mode_assignment — show and set which group is used for kernel work

   Only relevant when kernel_mode is not "inherit_ctrl_and_mon". Read the
   currently assigned group (path format is "CTRL_MON/MON/"):

     $ cat info/kernel_mode_assignment
     //

   "//" means the default CTRL_MON group is assigned. Assign a specific
   group instead (e.g. a CTRL_MON group "ctrl1", or a MON group "mon1" under it):

     $ echo "ctrl1//" > info/kernel_mode_assignment
     $ cat info/kernel_mode_assignment
     ctrl1//

     $ echo "ctrl1/mon1/" > info/kernel_mode_assignment
     $ cat info/kernel_mode_assignment
     ctrl1/mon1/

   Clear the assignment (no dedicated group for kernel work):

     $ echo >> info/kernel_mode_assignment
     $ cat info/kernel_mode_assignment
     Kmode is not configured

   Errors (e.g. invalid group name or unsupported mode) are reported in
   info/last_cmd_status.
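
   For illustration only (not part of the series): a minimal user-space
   Python sketch that splits the "CTRL_MON/MON/" form used in the examples
   above. The name parse_assignment() is hypothetical, and treating an
   empty component as "default group at that level" is an assumption drawn
   from the "//" and "ctrl1//" examples.

```python
def parse_assignment(text):
    """Split a "CTRL_MON/MON/" string into (ctrl_group, mon_group).

    An empty ctrl_group stands for the default CTRL_MON group; an empty
    mon_group means no MON group below the CTRL_MON group was named.
    """
    value = text.strip()
    parts = value.split("/")
    # "CTRL_MON/MON/" always yields three components, the last one empty.
    if len(parts) != 3 or parts[2] != "":
        raise ValueError(f"unexpected assignment format: {value!r}")
    return parts[0], parts[1]


print(parse_assignment("//"))
print(parse_assignment("ctrl1//"))
print(parse_assignment("ctrl1/mon1/"))
```

   Any other text, such as the "Kmode is not configured" message, does not
   match the path form and raises ValueError in this sketch.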

---

v2: 
     This is similar to the RFC, with a new proposal. The names of some of the
     interfaces are not final. Let's fix that later as we move forward.

     Separated the two features: Global Bandwidth Enforcement (GLBE) and
     Privilege Level Zero Association (PLZA).
 
     This series only adds support for PLZA.

     Named the feature kmode instead of PLZA. That can be changed as well.

     Tony suggested using global variables to store the kernel mode
     CLOSID and RMID. However, the kernel mode CLOSID and RMID are
     coming from rdtgroup structure with the new interface. Accessing
     them requires holding the associated lock, which would make the
     context switch path unnecessarily expensive. So, dropped the idea.
     https://lore.kernel.org/lkml/aXuxVSbk1GR2ttzF@agluck-desk3/
     Let me know if there are other ways to optimize this.

Patch 1: Data structures and arch hook: Add resctrl_kmode,
	resctrl_kmode_cfg, kernel-mode bits, and resctrl_arch_get_kmode_cfg()
	for generic resctrl kernel mode (e.g. PLZA).

Patch 2: Implement resctrl_arch_get_kmode_cfg() on x86, add global resctrl_kcfg
	and resctrl_kmode_init() to set default kmode.

Patch 3: Add info/kernel_mode and resctrl_kernel_mode_show() to list supported
	kernel modes and show the current one in brackets.

Patch 4: Add x86 PLZA support and boot option rdt=plza.

Patch 5: Add supported modes from CPUID.

Patch 6: Add rdt_kmode_enable_key and arch enable/disable helpers so PLZA only
	touches fast paths when enabled.

Patch 7: Add MSR_IA32_PQR_PLZA_ASSOC, bit defines, and union qos_pqr_plza_assoc
	for programming PLZA.

Patch 8: Add per-CPU and per-task kernel mode state.

Patch 9: Add resctrl_arch_configure_kmode() and resctrl_arch_set_kmode()
	to program PLZA per domain and set/clear it on a CPU.

Patch 10: In the sched-in path, program MSR_IA32_PQR_PLZA_ASSOC from task or
	per-CPU kmode; only write when kmode changes; guard with rdt_kmode_enable_key.

Patch 11: Add write handler so the current kernel mode can be set by name.

Patch 12: Add info/kernel_mode_assignment and show which rdtgroup is assigned
	for kernel mode in CTRL_MON/MON/ form.

Patch 13: Add write handler to assign/clear the group used for kernel mode;
	enforce single assignment and clear on rmdir.

Patch 14: Update per-CPU PLZA state when the group's cpu_mask changes (add/remove CPUs)
	via cpus_write_kmode() and helpers.

Patch 15: Refactor so task list respects t->kmode when the group has kmode (PLZA),
	so tasks are shown correctly.

Patch 16: Add arch helper to set task kmode.
--------------------------------------------------------------------------------

v1 : https://lore.kernel.org/lkml/cover.1769029977.git.babu.moger@amd.com/


Babu Moger (16):
  fs/resctrl: Add kernel mode (kmode) data structures and arch hook
  fs, x86/resctrl: Add architecture routines for kernel mode
    initialization
  fs/resctrl: Add info/kernel_mode file to show kernel mode options
  x86/resctrl: Support Privilege-Level Zero Association (PLZA)
  x86/resctrl: Initialize supported kernel modes when CPUID reports PLZA
  resctrl: Introduce kmode static key enable/disable helpers
  x86/resctrl: Add data structures and definitions for PLZA
    configuration
  x86/resctrl: Add per-CPU and per-task kernel mode state
  x86,fs/resctrl: Add the functionality to configure PLZA
  x86/resctrl: Add PLZA state tracking and context switch handling
  fs/resctrl: Add write handler for info/kernel_mode
  fs/resctrl: Add info/kernel_mode_assignment to show kernel-mode
    rdtgroup
  fs/resctrl: Add write interface for kernel_mode_assignment
  fs/resctrl: Update kmode configuration when cpu_mask changes
  x86/resctrl: Refactor show_rdt_tasks() to support PLZA tasks
  fs/resctrl: Add per-task kmode enable support via rdtgroup

 .../admin-guide/kernel-parameters.txt         |   2 +-
 Documentation/filesystems/resctrl.rst         |  69 ++
 arch/x86/include/asm/cpufeatures.h            |   1 +
 arch/x86/include/asm/msr-index.h              |   7 +
 arch/x86/include/asm/resctrl.h                |  92 ++-
 arch/x86/kernel/cpu/resctrl/core.c            |  12 +
 arch/x86/kernel/cpu/resctrl/ctrlmondata.c     |  77 +++
 arch/x86/kernel/cpu/resctrl/internal.h        |  26 +
 arch/x86/kernel/cpu/resctrl/rdtgroup.c        |   2 +
 arch/x86/kernel/cpu/scattered.c               |   1 +
 fs/resctrl/internal.h                         |   2 +
 fs/resctrl/rdtgroup.c                         | 635 +++++++++++++++++-
 include/linux/resctrl.h                       |  40 ++
 include/linux/resctrl_types.h                 |  30 +
 include/linux/sched.h                         |   2 +
 15 files changed, 989 insertions(+), 9 deletions(-)

-- 
2.43.0

Re: [PATCH v2 00/16] fs,x86/resctrl: Add kernel-mode (e.g., PLZA) support to the resctrl subsystem
Posted by Reinette Chatre 1 week, 6 days ago
Hi Babu,

On 3/12/26 1:36 PM, Babu Moger wrote:
> 
> This series adds support for Privilege-Level Zero Association (PLZA) to the
> resctrl subsystem. PLZA is an AMD feature that allows specifying a CLOSID
> and/or RMID for execution in kernel mode (privilege level zero), so that
> kernel work is not subject to the same resource constraints as the current
> user-space task. This avoids kernel operations being aggressively throttled
> when a task's memory bandwidth is heavily limited.
> 
> The feature documentation is not yet publicly available, but it is expected
> to be released in the next few weeks. In the meantime, a brief description
> of the features is provided below. 
> 
> Privilege Level Zero Association (PLZA) 
> 
> Privilege Level Zero Association (PLZA) allows the hardware to
> automatically associate execution in Privilege Level Zero (CPL=0) with a
> specific COS (Class of Service) and/or RMID (Resource Monitoring
> Identifier). The QoS feature set already has a mechanism to associate
> execution on each logical processor with an RMID or COS. PLZA allows the
> system to override this per-thread association for a thread that is
> executing with CPL=0. 
> ------------------------------------------------------------------------
> 
> The series introduces the feature in a way that supports the interface in
> a generic manner to accommodate MPAM or other vendor-specific implementations.
> 
> Below are the detailed requirements provided by Reinette:
> https://lore.kernel.org/lkml/2ab556af-095b-422b-9396-f845c6fd0342@intel.com/

Our discussion considered how resctrl could support PLZA in a generic way while
also preparing to support MPAM's variants and how PLZA may evolve to have similar
capabilities when considering the capabilities of its registers. 

This does not mean that your work needs to implement everything that was discussed.
Instead, this work is expected to just support what PLZA is capable of today but
do so in a way that the future enhancements could be added to.

This series is quite difficult to follow since it appears to implement a full
featured generic interface while PLZA cannot take advantage of it.

Could you please simplify this work to focus on just enabling PLZA and only
add interfaces needed to do so?

> 
> Summary:
> 1. Kernel-mode/PLZA controls and status should be exposed under the resctrl
>    info directory:/sys/fs/resctrl/info/, not as a separate or arch-specific path.
> 
> 2. Add two info files
> 
>  a. kernel_mode
>     Purpose: Control how resource allocation and monitoring apply in kernel mode
>     (e.g. inherit from task vs global assign).
> 
>     Read: List supported modes and show current one (e.g. with [brackets]).
>     Write: Set current mode by name (e.g. inherit_ctrl_and_mon, global_assign_ctrl_assign_mon).
> 
> b. kernel_mode_assignment
> 
>    Purpose: When a “global assign” kernel mode is active, specify which resctrl group
>    (CLOSID/RMID) is used for kernel work.
> 
>    Read: Show the assigned group in a path-like form (e.g. //, ctrl1//, ctrl1/mon1/).
>    Write: Assign or clear the group used for kernel mode (and optionally clear with an empty write).
> 
> The patches are based on top of commit (v7.0.0-rc3)
> 839e91ce3f41b (tip/master) Merge branch into tip/master: 'x86/tdx'
> ------------------------------------------------------------------------
> 
> Examples: kernel_mode and kernel_mode_assignment
> 
> All paths below are under /sys/fs/resctrl/ (e.g. info/kernel_mode means
> /sys/fs/resctrl/info/kernel_mode). Resctrl must be mounted and the platform
> must support the relevant modes (e.g. AMD with PLZA).
> 
> 1) kernel_mode — show and set the current kernel mode
> 
>    Read supported modes and which one is active (current in brackets):
> 
>      $ cat info/kernel_mode
>      [inherit_ctrl_and_mon]
>      global_assign_ctrl_inherit_mon
>      global_assign_ctrl_assign_mon
> 
>    Set the active mode (e.g. use one CLOSID+RMID for all kernel work):
> 
>      $ echo "global_assign_ctrl_assign_mon" > info/kernel_mode
>      $ cat info/kernel_mode
>      inherit_ctrl_and_mon
>      global_assign_ctrl_inherit_mon
>      [global_assign_ctrl_assign_mon]
> 
>    Mode meanings:
>    - inherit_ctrl_and_mon: kernel uses same CLOSID/RMID as the current task (default).
>    - global_assign_ctrl_inherit_mon: one CLOSID for all kernel work; RMID inherited from user.
>    - global_assign_ctrl_assign_mon: one resource group (CLOSID+RMID) for all kernel work.
> 
> 2) kernel_mode_assignment — show and set which group is used for kernel work
> 
>    Only relevant when kernel_mode is not "inherit_ctrl_and_mon". Read the

To help with future usages please connect visibility of this file with the mode in
info/kernel_mode. This helps us to support future modes with other resctrl files, possibly
within each resource group.
Specifically, kernel_mode_assignment is not visible to user space if mode is "inherit_ctrl_and_mon",
while it is visible when mode is global_assign_ctrl_inherit_mon or global_assign_ctrl_assign_mon.

>    currently assigned group (path format is "CTRL_MON/MON/"):

The format depends on the mode, right? If the mode is "global_assign_ctrl_inherit_mon"
then it should only contain a control group, alternatively, if the mode is
"global_assign_ctrl_assign_mon" then it contains control and mon group. This gives
resctrl future flexibility to change format for future modes.
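
For illustration only: a minimal Python sketch of the mode-dependent format
suggested above. The function name format_assignment() is hypothetical, and
returning None to stand for "file not visible" is an assumption made for the
sketch, not something the series implements.

```python
def format_assignment(mode, ctrl="", mon=""):
    """Render the assigned group in the form suggested for each mode.

    An empty ctrl or mon name stands for the default group at that level.
    """
    if mode == "inherit_ctrl_and_mon":
        # No group is assigned in this mode, so nothing is shown.
        return None
    if mode == "global_assign_ctrl_inherit_mon":
        # Only the control group matters; the RMID is inherited.
        return f"{ctrl}/"
    if mode == "global_assign_ctrl_assign_mon":
        # Both the control group and the monitor group are shown.
        return f"{ctrl}/{mon}/"
    raise ValueError(f"unknown kernel mode: {mode!r}")


print(format_assignment("global_assign_ctrl_inherit_mon", "ctrl1"))
print(format_assignment("global_assign_ctrl_assign_mon", "ctrl1", "mon1"))
```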

We should also consider the scenario when it is a "monitoring only" system, which can
happen independent from what hardware actually supports, for example, if user boots
with "rdt=!l3cat,!l2cat,!mba,!smba". In this case I assume CLOS should just always be
zero and thus only "default control group" is accepted?

> 
>      $ cat info/kernel_mode_assignment
>      //
> 
>    "//" means the default CTRL_MON group is assigned. Assign a specific
>    group instead (e.g. a CTRL_MON group "ctrl1", or a MON group "mon1" under it):
> 
>      $ echo "ctrl1//" > info/kernel_mode_assignment
>      $ cat info/kernel_mode_assignment
>      ctrl1//
> 
>      $ echo "ctrl1/mon1/" > info/kernel_mode_assignment
>      $ cat info/kernel_mode_assignment
>      ctrl1/mon1/
> 
>    Clear the assignment (no dedicated group for kernel work):
> 
>      $ echo >> info/kernel_mode_assignment
>      $ cat info/kernel_mode_assignment
>      Kmode is not configured

This does not look right. Would this not create a conflict between info/kernel_mode
and info/kernel_mode_assignment about what the current mode is? The way I see it
info/kernel_mode_assignment must always contain a valid group.

> 
>    Errors (e.g. invalid group name or unsupported mode) are reported in
>    info/last_cmd_status.
> 
> ---
> 
> v2: 
>      This is similar to the RFC, with a new proposal. The names of some of the
>      interfaces are not final. Let's fix that later as we move forward.
> 
>      Separated the two features: Global Bandwidth Enforcement (GLBE) and
>      Privilege Level Zero Association (PLZA).
>  
>      This series only adds support for PLZA.
> 
>      Used the name of the feature as kmode instead of PLZA. That can be changed as well.
> 
>      Tony suggested using global variables to store the kernel mode
>      CLOSID and RMID. However, the kernel mode CLOSID and RMID are
>      coming from rdtgroup structure with the new interface. Accessing
>      them requires holding the associated lock, which would make the
>      context switch path unnecessarily expensive. So, dropped the idea.
>      https://lore.kernel.org/lkml/aXuxVSbk1GR2ttzF@agluck-desk3/
>      Let me know if there are other ways to optimize this.

I do not see why the context switch path needs to be touched at all with this
implementation. Since PLZA only supports global assignment does it not mean that resctrl
only needs to update PQR_PLZA_ASSOC when user writes to info/kernel_mode and
info/kernel_mode_assignment?

Consider some of the scenarios:

resctrl mount with default state:

	# cat info/kernel_mode
	[inherit_ctrl_and_mon]
	global_assign_ctrl_inherit_mon
	global_assign_ctrl_assign_mon
	# ls info/kernel_mode_assignment
	ls: cannot access 'info/kernel_mode_assignment': No such file or directory

enable global_assign_ctrl_assign_mon mode:
	# echo "global_assign_ctrl_assign_mon" > info/kernel_mode

Expectation here is that when user space sets this mode as above then resctrl would
in turn program MSR_IA32_PQR_PLZA_ASSOC on all CPUs to be:
	MSR_IA32_PQR_PLZA_ASSOC.rmid=0
	MSR_IA32_PQR_PLZA_ASSOC.rmid_en=1
	MSR_IA32_PQR_PLZA_ASSOC.closid=0
	MSR_IA32_PQR_PLZA_ASSOC.closid_en=1
	MSR_IA32_PQR_PLZA_ASSOC.plza_en=1

I do not see why it is necessary to maintain any per-CPU or per-task state or to
touch the context switch code. Since PLZA only supports global assignment, could resctrl
not just set MSR_IA32_PQR_PLZA_ASSOC on all online CPUs and be done with it?
The only caveat is that if a CPU is offline then this setting needs to be stashed
so that MSR_IA32_PQR_PLZA_ASSOC can be set when the CPU comes online.
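
For illustration only: a small Python model of the "set on all online CPUs,
stash for later" idea described above. It is not kernel code. The class and
method names are hypothetical, and the dictionary merely stands in for the
per-CPU MSR writes.

```python
class PlzaGlobalState:
    """Toy model: one global setting applied to every online CPU."""

    def __init__(self, online_cpus):
        self.online = set(online_cpus)
        self.wanted = None  # stashed setting for CPUs that come online later
        self.msr = {}       # cpu -> last value "written" (stand-in for an MSR write)

    def set_global(self, value):
        # Remember the setting, then apply it to every CPU that is online now.
        self.wanted = value
        for cpu in self.online:
            self.msr[cpu] = value

    def cpu_online(self, cpu):
        # A CPU coming online picks up the stashed setting, if any.
        self.online.add(cpu)
        if self.wanted is not None:
            self.msr[cpu] = self.wanted

    def cpu_offline(self, cpu):
        # The model does not rely on register contents of offline CPUs.
        self.online.discard(cpu)
        self.msr.pop(cpu, None)


state = PlzaGlobalState(online_cpus=[0, 1])
state.set_global("closid=0,rmid=0,plza_en=1")
state.cpu_online(2)
print(sorted(state.msr))
```

In this model no per-task state and no context switch hook is involved; the
only event handlers are the user write and the CPU online/offline callbacks.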

The way that rdtgroup_config_kmode(), introduced in patch #11, assumes it is dealing
with RDT_RESOURCE_L3 and traverses the resource domain list and resource group
CPU mask seems unnecessary to me, as well as error prone, since the system may only
have, for example, RDT_RESOURCE_MBA enabled, or even just monitoring. Why not just set
MSR_IA32_PQR_PLZA_ASSOC on all CPUs and be done?

To continue the scenarios ...

After the user's setting above, the related files read:
	# cat info/kernel_mode
	inherit_ctrl_and_mon
	global_assign_ctrl_inherit_mon
	[global_assign_ctrl_assign_mon]
	# cat info/kernel_mode_assignment
	//

Modify group used by global_assign_ctrl_assign_mon mode:
	# echo 'ctrl1/mon1/' > info/kernel_mode_assignment

Expectation here is that when user space sets this then resctrl would
program MSR_IA32_PQR_PLZA_ASSOC on all CPUs to be:
	MSR_IA32_PQR_PLZA_ASSOC.rmid=<rmid of mon1>
	MSR_IA32_PQR_PLZA_ASSOC.rmid_en=1
	MSR_IA32_PQR_PLZA_ASSOC.closid=<closid of ctrl1>
	MSR_IA32_PQR_PLZA_ASSOC.closid_en=1
	MSR_IA32_PQR_PLZA_ASSOC.plza_en=1

Enable global_assign_ctrl_inherit_mon mode:
	# echo "global_assign_ctrl_inherit_mon" > info/kernel_mode

Expectation here is that when user space sets this mode then resctrl would
program MSR_IA32_PQR_PLZA_ASSOC on all CPUs to be:
	MSR_IA32_PQR_PLZA_ASSOC.rmid=0
	MSR_IA32_PQR_PLZA_ASSOC.rmid_en=0
	MSR_IA32_PQR_PLZA_ASSOC.closid=0
	MSR_IA32_PQR_PLZA_ASSOC.closid_en=1
	MSR_IA32_PQR_PLZA_ASSOC.plza_en=1

	# cat info/kernel_mode
	inherit_ctrl_and_mon
	[global_assign_ctrl_inherit_mon]
	global_assign_ctrl_assign_mon
	# cat info/kernel_mode_assignment <==== returns just a ctrl group
	/

Modify group used by global_assign_ctrl_inherit_mon mode:
	# echo ctrl1 > info/kernel_mode_assignment

Expectation here is that when user space sets this then resctrl would
program MSR_IA32_PQR_PLZA_ASSOC on all CPUs to be:
	MSR_IA32_PQR_PLZA_ASSOC.rmid=0
	MSR_IA32_PQR_PLZA_ASSOC.rmid_en=0
	MSR_IA32_PQR_PLZA_ASSOC.closid=<closid of ctrl1>
	MSR_IA32_PQR_PLZA_ASSOC.closid_en=1
	MSR_IA32_PQR_PLZA_ASSOC.plza_en=1

	# cat info/kernel_mode_assignment <==== returns just a ctrl group
	ctrl1/

Enable inherit_ctrl_and_mon mode:
	# echo "inherit_ctrl_and_mon" > info/kernel_mode

Expectation here is that when user space sets this mode then resctrl would
program MSR_IA32_PQR_PLZA_ASSOC on all CPUs to be:
	MSR_IA32_PQR_PLZA_ASSOC.rmid=0
	MSR_IA32_PQR_PLZA_ASSOC.rmid_en=0
	MSR_IA32_PQR_PLZA_ASSOC.closid=0
	MSR_IA32_PQR_PLZA_ASSOC.closid_en=0
	MSR_IA32_PQR_PLZA_ASSOC.plza_en=0

At this point info/kernel_mode_assignment is not visible anymore:

	# ls info/kernel_mode_assignment
	ls: cannot access 'info/kernel_mode_assignment': No such file or directory
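
For illustration only: the scenarios above can be condensed into a small
Python sketch that maps a mode plus the assigned group's CLOSID/RMID to the
MSR_IA32_PQR_PLZA_ASSOC field values. The field names are taken from the
scenarios; the class and function names are hypothetical, and no register
bit layout is implied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlzaAssoc:
    """Field values only; this says nothing about bit positions."""
    rmid: int = 0
    rmid_en: int = 0
    closid: int = 0
    closid_en: int = 0
    plza_en: int = 0


def plza_fields(mode, closid=0, rmid=0):
    """Return the field values expected for a kernel mode and assigned group."""
    if mode == "inherit_ctrl_and_mon":
        # Everything is cleared; kernel work follows the current task.
        return PlzaAssoc()
    if mode == "global_assign_ctrl_inherit_mon":
        # Only the CLOSID is overridden; the RMID stays inherited.
        return PlzaAssoc(closid=closid, closid_en=1, plza_en=1)
    if mode == "global_assign_ctrl_assign_mon":
        # Both the CLOSID and the RMID are overridden.
        return PlzaAssoc(rmid=rmid, rmid_en=1,
                         closid=closid, closid_en=1, plza_en=1)
    raise ValueError(f"unknown kernel mode: {mode!r}")


# Default group (CLOSID 0, RMID 0) right after switching modes.
print(plza_fields("global_assign_ctrl_assign_mon"))
```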

From what I understand, the above exposes and enables the full capability of PLZA. All the other
per-task and per-CPU handling in this series is not something that PLZA can benefit from.
If this is not the case, what am I missing? Could this series be simplified to just support
PLZA today? When the next hardware with more capability needs to be supported, resctrl could be
enhanced to support it by using the more accurate information about what the hardware is
capable of.

We also do not really know what use cases users prefer. This may even be sufficient.

Reinette
Re: [PATCH v2 00/16] fs,x86/resctrl: Add kernel-mode (e.g., PLZA) support to the resctrl subsystem
Posted by Babu Moger 1 week, 4 days ago
Hi Reinette,

Thanks for the review comments. Will address one by one.

On 3/24/26 17:51, Reinette Chatre wrote:
> Hi Babu,
>
> On 3/12/26 1:36 PM, Babu Moger wrote:
>> This series adds support for Privilege-Level Zero Association (PLZA) to the
>> resctrl subsystem. PLZA is an AMD feature that allows specifying a CLOSID
>> and/or RMID for execution in kernel mode (privilege level zero), so that
>> kernel work is not subject to the same resource constraints as the current
>> user-space task. This avoids kernel operations being aggressively throttled
>> when a task's memory bandwidth is heavily limited.
>>
>> The feature documentation is not yet publicly available, but it is expected
>> to be released in the next few weeks. In the meantime, a brief description
>> of the features is provided below.
>>
>> Privilege Level Zero Association (PLZA)
>>
>> Privilege Level Zero Association (PLZA) allows the hardware to
>> automatically associate execution in Privilege Level Zero (CPL=0) with a
>> specific COS (Class of Service) and/or RMID (Resource Monitoring
>> Identifier). The QoS feature set already has a mechanism to associate
>> execution on each logical processor with an RMID or COS. PLZA allows the
>> system to override this per-thread association for a thread that is
>> executing with CPL=0.
>> ------------------------------------------------------------------------
>>
>> The series introduces the feature in a way that supports the interface in
>> a generic manner to accommodate MPAM or other vendor-specific implementations.
>>
>> Below are the detailed requirements provided by Reinette:
>> https://lore.kernel.org/lkml/2ab556af-095b-422b-9396-f845c6fd0342@intel.com/
> Our discussion considered how resctrl could support PLZA in a generic way while
> also preparing to support MPAM's variants and how PLZA may evolve to have similar
> capabilities when considering the capabilities of its registers.
>
> This does not mean that your work needs to implement everything that was discussed.
> Instead, this work is expected to just support what PLZA is capable of today but
> do so in a way that the future enhancements could be added to.
>
> This series is quite difficult to follow since it appears to implement a full
> featured generic interface while PLZA cannot take advantage of it.
>
> Could you please simplify this work to focus on just enabling PLZA and only
> add interfaces needed to do so?
Sure. Will try. Let's continue the discussion.
>
>> Summary:
>> 1. Kernel-mode/PLZA controls and status should be exposed under the resctrl
>>     info directory:/sys/fs/resctrl/info/, not as a separate or arch-specific path.
>>
>> 2. Add two info files
>>
>>   a. kernel_mode
>>      Purpose: Control how resource allocation and monitoring apply in kernel mode
>>      (e.g. inherit from task vs global assign).
>>
>>      Read: List supported modes and show current one (e.g. with [brackets]).
>>      Write: Set current mode by name (e.g. inherit_ctrl_and_mon, global_assign_ctrl_assign_mon).
>>
>> b. kernel_mode_assignment
>>
>>     Purpose: When a “global assign” kernel mode is active, specify which resctrl group
>>     (CLOSID/RMID) is used for kernel work.
>>
>>     Read: Show the assigned group in a path-like form (e.g. //, ctrl1//, ctrl1/mon1/).
>>     Write: Assign or clear the group used for kernel mode (and optionally clear with an empty write).
>>
>> The patches are based on top of commit (v7.0.0-rc3)
>> 839e91ce3f41b (tip/master) Merge branch into tip/master: 'x86/tdx'
>> ------------------------------------------------------------------------
>>
>> Examples: kernel_mode and kernel_mode_assignment
>>
>> All paths below are under /sys/fs/resctrl/ (e.g. info/kernel_mode means
>> /sys/fs/resctrl/info/kernel_mode). Resctrl must be mounted and the platform
>> must support the relevant modes (e.g. AMD with PLZA).
>>
>> 1) kernel_mode — show and set the current kernel mode
>>
>>     Read supported modes and which one is active (current in brackets):
>>
>>       $ cat info/kernel_mode
>>       [inherit_ctrl_and_mon]
>>       global_assign_ctrl_inherit_mon
>>       global_assign_ctrl_assign_mon
>>
>>     Set the active mode (e.g. use one CLOSID+RMID for all kernel work):
>>
>>       $ echo "global_assign_ctrl_assign_mon" > info/kernel_mode
>>       $ cat info/kernel_mode
>>       inherit_ctrl_and_mon
>>       global_assign_ctrl_inherit_mon
>>       [global_assign_ctrl_assign_mon]
>>
>>     Mode meanings:
>>     - inherit_ctrl_and_mon: kernel uses same CLOSID/RMID as the current task (default).
>>     - global_assign_ctrl_inherit_mon: one CLOSID for all kernel work; RMID inherited from user.
>>     - global_assign_ctrl_assign_mon: one resource group (CLOSID+RMID) for all kernel work.
>>
>> 2) kernel_mode_assignment — show and set which group is used for kernel work
>>
>>     Only relevant when kernel_mode is not "inherit_ctrl_and_mon". Read the
> To help with future usages please connect visibility of this file with the mode in
> info/kernel_mode. This helps us to support future modes with other resctrl files, possibly
> within each resource group.
> Specifically, kernel_mode_assignment is not visible to user space if mode is "inherit_ctrl_and_mon",
> while it is visible when mode is global_assign_ctrl_inherit_mon or global_assign_ctrl_assign_mon.

Sure. Will do.

>
>>     currently assigned group (path format is "CTRL_MON/MON/"):
> The format depends on the mode, right? If the mode is "global_assign_ctrl_inherit_mon"
> then it should only contain a control group, alternatively, if the mode is
> "global_assign_ctrl_assign_mon" then it contains control and mon group. This gives
> resctrl future flexibility to change format for future modes.

This can be done both ways. The whole purpose of these groups is to get the
CLOSID and RMID needed to enable PLZA. The user can echo a CTRL_MON or MON group to
kernel_mode_assignment in any of the modes. We can decide what needs to
be updated in the MSR (PQR_PLZA_ASSOC) based on which kernel mode is selected.


>
> We should also consider the scenario when it is a "monitoring only" system, which can
> happen independent from what hardware actually supports, for example, if user boots
> with "rdt=!l3cat,!l2cat,!mba,!smba". In this case I assume CLOS should just always be
> zero and thus only "default control group" is accepted?

Yes. It depends on how we want to implement it, as mentioned above.


>
>>       $ cat info/kernel_mode_assignment
>>       //
>>
>>     "//" means the default CTRL_MON group is assigned. Assign a specific
>>     group instead (e.g. a CTRL_MON group "ctrl1", or a MON group "mon1" under it):
>>
>>       $ echo "ctrl1//" > info/kernel_mode_assignment
>>       $ cat info/kernel_mode_assignment
>>       ctrl1//
>>
>>       $ echo "ctrl1/mon1/" > info/kernel_mode_assignment
>>       $ cat info/kernel_mode_assignment
>>       ctrl1/mon1/
>>
>>     Clear the assignment (no dedicated group for kernel work):
>>
>>       $ echo >> info/kernel_mode_assignment
>>       $ cat info/kernel_mode_assignment
>>       Kmode is not configured
> This does not look right. Would this not create a conflict between info/kernel_mode
> and info/kernel_mode_assignment about what the current mode is? The way I see it
> info/kernel_mode_assignment must always contain a valid group.
Yes.  We can do that.
>
>>     Errors (e.g. invalid group name or unsupported mode) are reported in
>>     info/last_cmd_status.
>>
>> ---
>>
>> v2:
>>       This is similar to the RFC, with a new proposal. The names of some of the
>>       interfaces are not final. Let's fix that later as we move forward.
>>
>>       Separated the two features: Global Bandwidth Enforcement (GLBE) and
>>       Privilege Level Zero Association (PLZA).
>>   
>>       This series only adds support for PLZA.
>>
>>       Used the name of the feature as kmode instead of PLZA. That can be changed as well.
>>
>>       Tony suggested using global variables to store the kernel mode
>>       CLOSID and RMID. However, the kernel mode CLOSID and RMID are
>>       coming from rdtgroup structure with the new interface. Accessing
>>       them requires holding the associated lock, which would make the
>>       context switch path unnecessarily expensive. So, dropped the idea.
>>       https://lore.kernel.org/lkml/aXuxVSbk1GR2ttzF@agluck-desk3/
>>       Let me know if there are other ways to optimize this.
> I do not see why the context switch path needs to be touched at all with this
> implementation. Since PLZA only supports global assignment does it not mean that resctrl
> only needs to update PQR_PLZA_ASSOC when user writes to info/kernel_mode and
> info/kernel_mode_assignment?

Each thread has an MSR to configure whether to associate privilege level 
zero execution with a separate COS and/or RMID, and the value of the COS 
and/or RMID.  PLZA may be enabled or disabled on a per-thread 
basis. However, the COS and RMID association and configuration must be 
the same for all threads in the QOS Domain.

So, PQR_PLZA_ASSOC is a per thread MSR just like PQR_ASSOC.

Privilege-Level Zero Association (PLZA) allows the user to specify a COS
and/or RMID associated with execution in Privilege-Level Zero. When PLZA
is enabled on a HW thread and that thread enters Privilege-Level Zero,
its transactions are associated with the PLZA COS and/or RMID. Otherwise,
the HW thread is associated with the COS and RMID identified by
PQR_ASSOC.
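
To make the selection rule concrete, here is a rough software model of it.
This is only a sketch of my reading of the description above: the struct
layout and the effective_assoc() helper are hypothetical, the field names
(closid_en, rmid_en, plza_en) are borrowed from the scenarios in this
thread, and nothing here reflects the real MSR bit layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Software model of the per-thread association state (not the MSR bit layout). */
struct pqr_assoc {
	uint32_t closid;
	uint32_t rmid;
};

/* Hypothetical model of PQR_PLZA_ASSOC, using field names from this thread. */
struct pqr_plza_assoc {
	uint32_t closid;
	uint32_t rmid;
	bool closid_en;	/* use the PLZA COS at CPL=0 */
	bool rmid_en;	/* use the PLZA RMID at CPL=0 */
	bool plza_en;	/* PLZA enabled on this HW thread */
};

/* Return the COS/RMID that tags transactions at the given privilege level. */
struct pqr_assoc effective_assoc(struct pqr_assoc task,
				 struct pqr_plza_assoc plza,
				 unsigned int cpl)
{
	struct pqr_assoc eff = task;	/* default: follow PQR_ASSOC */

	if (cpl == 0 && plza.plza_en) {
		if (plza.closid_en)
			eff.closid = plza.closid;
		if (plza.rmid_en)
			eff.rmid = plza.rmid;
	}
	return eff;
}
```

With closid_en=1 and rmid_en=0 the model returns the PLZA CLOSID together
with the task's own RMID at CPL=0, and PQR_ASSOC values everywhere else.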

More below.

>
> Consider some of the scenarios:
>
> resctrl mount with default state:
>
> 	# cat info/kernel_mode
> 	[inherit_ctrl_and_mon]
> 	global_assign_ctrl_inherit_mon
> 	global_assign_ctrl_assign_mon
> 	# ls info/kernel_mode_assignment
> 	ls: cannot access 'info/kernel_mode_assignment': No such file or directory
>
> enable global_assign_ctrl_assign_mon mode:
> 	# echo "global_assign_ctrl_assign_mon" > info/kernel_mode
>
> Expectation here is that when user space sets this mode as above then resctrl would
> in turn program MSR_IA32_PQR_PLZA_ASSOC on all CPUs to be:
> 	MSR_IA32_PQR_PLZA_ASSOC.rmid=0
> 	MSR_IA32_PQR_PLZA_ASSOC.rmid_en=1
> 	MSR_IA32_PQR_PLZA_ASSOC.closid=0
> 	MSR_IA32_PQR_PLZA_ASSOC.closid_en=1
> 	MSR_IA32_PQR_PLZA_ASSOC.plza_en=1
>
> I do not see why it is necessary to maintain any per-CPU or per-task state or needing
> to touch the context switch code. Since PLZA only supports global could it not
> just set MSR_IA32_PQR_PLZA_ASSOC on all online CPUs and be done with it?
> Only caveat is that if a CPU is offline then this setting needs to be stashed
> so that MSR_IA32_PQR_PLZA_ASSOC can be set when new CPU comes online.
>
> The way that rdtgroup_config_kmode() introduced in patch #11 assumes it is dealing
> with RDT_RESOURCE_L3 and traverses the resource domain list and resource group
> CPU mask seems unnecessary to me as well as error prone since the system may only
> have, for example, RDT_RESOURCE_MBA enabled or even just monitoring. Why not just set
> MSR_IA32_PQR_PLZA_ASSOC on all CPUs and be done?
>
> To continue the scenarios ...
>
> After user's setting above related files read:
> 	# cat info/kernel_mode
> 	inherit_ctrl_and_mon
> 	global_assign_ctrl_inherit_mon
> 	[global_assign_ctrl_assign_mon]
> 	# cat info/kernel_mode_assignment
> 	//
>
> Modify group used by global_assign_ctrl_assign_mon mode:
> 	# echo 'ctrl1/mon1/' > info/kernel_mode_assignment
>
> Expectation here is that when user space sets this then resctrl would
> program MSR_IA32_PQR_PLZA_ASSOC on all CPUs to be:
> 	MSR_IA32_PQR_PLZA_ASSOC.rmid=<rmid of mon1>
> 	MSR_IA32_PQR_PLZA_ASSOC.rmid_en=1
> 	MSR_IA32_PQR_PLZA_ASSOC.closid=<closid of ctrl1>
> 	MSR_IA32_PQR_PLZA_ASSOC.closid_en=1
> 	MSR_IA32_PQR_PLZA_ASSOC.plza_en=1


This works correctly when PLZA associations are defined per CPU. For
example, let's assume that *ctrl1* is assigned *CLOSID 1*.

In this scenario, every task in the system running on any CPU will use
the limits associated with *CLOSID 1* whenever it enters Privilege-Level
Zero, because the CPU's *PQR_PLZA_ASSOC* register has PLZA enabled and
CLOSID is 1.

Now consider task-based association:

We have two resctrl groups:

  * *ctrl1* -> *CLOSID 1* -> task1.plza = 1: User wants PLZA enabled
    for this task.
  * *ctrl2* -> *CLOSID 2* -> task2.plza = 0: User wants PLZA
    disabled for this task.

Suppose *task1* is first scheduled on *CPU 0*. This behaves as expected:
since CPU 0's *PQR_PLZA_ASSOC* contains *CLOSID 1, plza_en=1*, task1
will use the limits from CLOSID 1 when it enters Privilege-Level Zero.

However, if *task2* later runs on *CPU 0*, we expect it to use *CLOSID
2* in both user mode and kernel mode, because the user has PLZA disabled
for this task. But CPU 0 still has *CLOSID 1, plza_en=1* in its
PQR_PLZA_ASSOC register.

As a result, task2 will incorrectly run with *CLOSID 1* when entering
Privilege-Level Zero, something we explicitly want to avoid.

At that point, PLZA must be disabled on CPU 0 to prevent the unintended 
association. Hope this explanation makes the issue clear.
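
The task1/task2 case can be walked through with a small model. This is
only an illustrative sketch, not code from the series: struct task, struct
cpu_state, switch_to() and kernel_closid() are hypothetical names, and the
per-task "plza" flag stands in for whatever per-task state ends up being
used.

```c
#include <stdbool.h>
#include <stdint.h>

struct task {
	uint32_t closid;	/* CLOSID of the task's resctrl group */
	bool plza;		/* hypothetical per-task PLZA opt-in */
};

struct cpu_state {
	uint32_t pqr_closid;	/* models PQR_ASSOC.closid */
	uint32_t plza_closid;	/* models PQR_PLZA_ASSOC.closid */
	bool plza_en;		/* models PQR_PLZA_ASSOC.plza_en */
};

/* Context switch: PQR_ASSOC always follows the task; plza_en only if asked. */
void switch_to(struct cpu_state *cpu, const struct task *next,
	       bool update_plza_en)
{
	cpu->pqr_closid = next->closid;
	if (update_plza_en)
		cpu->plza_en = next->plza;
}

/* CLOSID in effect while the current task runs at Privilege-Level Zero. */
uint32_t kernel_closid(const struct cpu_state *cpu)
{
	return cpu->plza_en ? cpu->plza_closid : cpu->pqr_closid;
}
```

If CPU 0 starts with plza_closid=1 and plza_en=1 and switch_to() is called
for task2 with update_plza_en=false, kernel_closid() still returns 1, which
is the unintended association described above. With update_plza_en=true it
returns 2 for task2 and 1 for task1.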

Thanks

Babu

>
> Enable global_assign_ctrl_inherit_mon mode:
> 	# echo "global_assign_ctrl_inherit_mon" > info/kernel_mode
>
> Expectation here is that when user space sets this mode then resctrl would
> program MSR_IA32_PQR_PLZA_ASSOC on all CPUs to be:
> 	MSR_IA32_PQR_PLZA_ASSOC.rmid=0
> 	MSR_IA32_PQR_PLZA_ASSOC.rmid_en=0
> 	MSR_IA32_PQR_PLZA_ASSOC.closid=0
> 	MSR_IA32_PQR_PLZA_ASSOC.closid_en=1
> 	MSR_IA32_PQR_PLZA_ASSOC.plza_en=1
>
> 	# cat info/kernel_mode
> 	inherit_ctrl_and_mon
> 	[global_assign_ctrl_inherit_mon]
> 	global_assign_ctrl_assign_mon
> 	# cat info/kernel_mode_assignment <==== returns just a ctrl group
> 	/
>
> Modify group used by global_assign_ctrl_inherit_mon mode:
> 	# echo ctrl1 > info/kernel_mode_assignment
>
> Expectation here is that when user space sets this then resctrl would
> program MSR_IA32_PQR_PLZA_ASSOC on all CPUs to be:
> 	MSR_IA32_PQR_PLZA_ASSOC.rmid=0
> 	MSR_IA32_PQR_PLZA_ASSOC.rmid_en=0
> 	MSR_IA32_PQR_PLZA_ASSOC.closid=<closid of ctrl1>
> 	MSR_IA32_PQR_PLZA_ASSOC.closid_en=1
> 	MSR_IA32_PQR_PLZA_ASSOC.plza_en=1
>
> 	# cat info/kernel_mode_assignment <==== returns just a ctrl group
> 	ctrl/
>
> Enable inherit_ctrl_and_mon mode:
> 	# echo "inherit_ctrl_and_mon" > info/kernel_mode
>
> Expectation here is that when user space sets this mode then resctrl would
> program MSR_IA32_PQR_PLZA_ASSOC on all CPUs to be:
> 	MSR_IA32_PQR_PLZA_ASSOC.rmid=0
> 	MSR_IA32_PQR_PLZA_ASSOC.rmid_en=0
> 	MSR_IA32_PQR_PLZA_ASSOC.closid=0
> 	MSR_IA32_PQR_PLZA_ASSOC.closid_en=0
> 	MSR_IA32_PQR_PLZA_ASSOC.plza_en=0
>
> At this point info/kernel_mode_assignment is not visible anymore:
>
> 	# ls info/kernel_mode_assignment
> 	ls: cannot access 'info/kernel_mode_assignment': No such file or directory
>
> From what I understand above exposes and enables full capability of PLZA. All the other
> per-task and per-cpu handling in this series is not something that PLZA can benefit from.
> If this is not the case, what am I missing? Could this series be simplified to just support
> PLZA today? When next hardware with more capability needs to be supported resctrl could be
> enhanced to support it by using the more accurate information about what the hardware is
> capable of.
>
> We also do not really know what use cases users prefer. This may even be sufficient.
>
> Reinette
>
Re: [PATCH v2 00/16] fs,x86/resctrl: Add kernel-mode (e.g., PLZA) support to the resctrl subsystem
Posted by Reinette Chatre 1 week, 3 days ago
Hi Babu,

On 3/26/26 10:12 AM, Babu Moger wrote:
> Hi Reinette,
> 
> Thanks for the review comments. Will address one by one.
> 
> On 3/24/26 17:51, Reinette Chatre wrote:
>> Hi Babu,
>>
>> On 3/12/26 1:36 PM, Babu Moger wrote:
>>> This series adds support for Privilege-Level Zero Association (PLZA) to the
>>> resctrl subsystem. PLZA is an AMD feature that allows specifying a CLOSID
>>> and/or RMID for execution in kernel mode (privilege level zero), so that
>>> kernel work is not subject to the same resource constraints as the current
>>> user-space task. This avoids kernel operations being aggressively throttled
>>> when a task's memory bandwidth is heavily limited.
>>>
>>> The feature documentation is not yet publicly available, but it is expected
>>> to be released in the next few weeks. In the meantime, a brief description
>>> of the features is provided below.
>>>
>>> Privilege Level Zero Association (PLZA)
>>>
>>> Privilege Level Zero Association (PLZA) allows the hardware to
>>> automatically associate execution in Privilege Level Zero (CPL=0) with a
>>> specific COS (Class of Service) and/or RMID (Resource Monitoring
>>> Identifier). The QoS feature set already has a mechanism to associate
>>> execution on each logical processor with an RMID or COS. PLZA allows the
>>> system to override this per-thread association for a thread that is
>>> executing with CPL=0.
>>> ------------------------------------------------------------------------
>>>
>>> The series introduces the feature in a way that supports the interface in
>>> a generic manner to accommodate MPAM or other vendor specific implementation.
>>>
>>> Below is the detailed requirements provided by Reinette:
>>> https://lore.kernel.org/lkml/2ab556af-095b-422b-9396-f845c6fd0342@intel.com/
>> Our discussion considered how resctrl could support PLZA in a generic way while
>> also preparing to support MPAM's variants and how PLZA may evolve to have similar
>> capabilities when considering the capabilities of its registers.
>>
>> This does not mean that your work needs to implement everything that was discussed.
>> Instead, this work is expected to just support what PLZA is capable of today but
>> do so in a way that the future enhancements could be added to.
>>
>> This series is quite difficult to follow since it appears to implement a full
>> featured generic interface while PLZA cannot take advantage of it.
>>
>> Could you please simplify this work to focus on just enabling PLZA and only
>> add interfaces needed to do so?
> Sure. Will try. Lets continue the discussion.
>>
>>> Summary:
>>> 1. Kernel-mode/PLZA controls and status should be exposed under the resctrl
>>>     info directory:/sys/fs/resctrl/info/, not as a separate or arch-specific path.
>>>
>>> 2. Add two info files
>>>
>>>   a. kernel_mode
>>>      Purpose: Control how resource allocation and monitoring apply in kernel mode
>>>      (e.g. inherit from task vs global assign).
>>>
>>>      Read: List supported modes and show current one (e.g. with [brackets]).
>>>      Write: Set current mode by name (e.g. inherit_ctrl_and_mon, global_assign_ctrl_assign_mon).
>>>
>>> b. kernel_mode_assignment
>>>
>>>     Purpose: When a “global assign” kernel mode is active, specify which resctrl group
>>>     (CLOSID/RMID) is used for kernel work.
>>>
>>>     Read: Show the assigned group in a path-like form (e.g. //, ctrl1//, ctrl1/mon1/).
>>>     Write: Assign or clear the group used for kernel mode (and optionally clear with an empty write).
>>>
>>> The patches are based on top of commit (v7.0.0-rc3)
>>> 839e91ce3f41b (tip/master) Merge branch into tip/master: 'x86/tdx'
>>> ------------------------------------------------------------------------
>>>
>>> Examples: kernel_mode and kernel_mode_assignment
>>>
>>> All paths below are under /sys/fs/resctrl/ (e.g. info/kernel_mode means
>>> /sys/fs/resctrl/info/kernel_mode). Resctrl must be mounted and the platform
>>> must support the relevant modes (e.g. AMD with PLZA).
>>>
>>> 1) kernel_mode — show and set the current kernel mode
>>>
>>>     Read supported modes and which one is active (current in brackets):
>>>
>>>       $ cat info/kernel_mode
>>>       [inherit_ctrl_and_mon]
>>>       global_assign_ctrl_inherit_mon
>>>       global_assign_ctrl_assign_mon
>>>
>>>     Set the active mode (e.g. use one CLOSID+RMID for all kernel work):
>>>
>>>       $ echo "global_assign_ctrl_assign_mon" > info/kernel_mode
>>>       $ cat info/kernel_mode
>>>       inherit_ctrl_and_mon
>>>       global_assign_ctrl_inherit_mon
>>>       [global_assign_ctrl_assign_mon]
>>>
>>>     Mode meanings:
>>>     - inherit_ctrl_and_mon: kernel uses same CLOSID/RMID as the current task (default).
>>>     - global_assign_ctrl_inherit_mon: one CLOSID for all kernel work; RMID inherited from user.
>>>     - global_assign_ctrl_assign_mon: one resource group (CLOSID+RMID) for all kernel work.
>>>
>>> 2) kernel_mode_assignment — show and set which group is used for kernel work
>>>
>>>     Only relevant when kernel_mode is not "inherit_ctrl_and_mon". Read the
>> To help with future usages please connect visibility of this file with the mode in
>> info/kernel_mode. This helps us to support future modes with other resctrl files, possible
>> within each resource group.
>> Specifically, kernel_mode_assignment is not visible to user space if mode is "inherit_ctrl_and_mon",
>> while it is visible when mode is global_assign_ctrl_inherit_mon or global_assign_ctrl_assign_mon.
> 
> Sure. Will do.
> 
>>
>>>     currently assigned group (path format is "CTRL_MON/MON/"):
>> The format depends on the mode, right? If the mode is "global_assign_ctrl_inherit_mon"
>> then it should only contain a control group, alternatively, if the mode is
>> "global_assign_ctrl_assign_mon" then it contains control and mon group. This gives
>> resctrl future flexibility to change format for future modes.
> 
> This can be done both ways.  Whole purpose of these groups is to get CLOSID and RMID to enable PLZA. User can echo CTRL_MON or MON group to kernel_mode_assignment in any of the modes.  We can decide what needs to be updated in MSR (PQR_PLZA_ASSOC) based on what kernel mode is selected.

The "both ways" are specific to one of the two active modes though.
PLZA only needs the RMID when the mode is "global_assign_ctrl_assign_mon".

Displaying and parsing monitor group when the mode is
"global_assign_ctrl_inherit_mon" creates an inconsistent interface since the mode
only uses a control group. The interface to user space should match the mode otherwise
it becomes confusing.
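
As a sketch of how the mode could drive both the MSR enables and the
accepted format, the mapping below follows the expectations listed in the
scenarios in this thread. The struct and helper names are made up for
illustration and are not from the series.

```c
#include <stdbool.h>
#include <string.h>

/* Enable fields of PQR_PLZA_ASSOC as named in the scenarios in this thread. */
struct kmode_enables {
	bool closid_en;
	bool rmid_en;
	bool plza_en;
};

/* Returns 0 and fills *en for a known mode name, -1 otherwise. */
int kmode_to_enables(const char *mode, struct kmode_enables *en)
{
	if (!strcmp(mode, "inherit_ctrl_and_mon"))
		*en = (struct kmode_enables){ false, false, false };
	else if (!strcmp(mode, "global_assign_ctrl_inherit_mon"))
		*en = (struct kmode_enables){ true, false, true };
	else if (!strcmp(mode, "global_assign_ctrl_assign_mon"))
		*en = (struct kmode_enables){ true, true, true };
	else
		return -1;
	return 0;
}

/* A monitor group is only meaningful when the mode assigns an RMID. */
bool kmode_uses_mon_group(const struct kmode_enables *en)
{
	return en->plza_en && en->rmid_en;
}
```

A parser for info/kernel_mode_assignment could then reject or omit the
monitor group component whenever kmode_uses_mon_group() is false.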

...


>>>
>>>       Tony suggested using global variables to store the kernel mode
>>>       CLOSID and RMID. However, the kernel mode CLOSID and RMID are
>>>       coming from rdtgroup structure with the new interface. Accessing
>>>       them requires holding the associated lock, which would make the
>>>       context switch path unnecessarily expensive. So, dropped the idea.
>>>       https://lore.kernel.org/lkml/aXuxVSbk1GR2ttzF@agluck-desk3/
>>>       Let me know if there are other ways to optimize this.
>> I do not see why the context switch path needs to be touched at all with this
>> implementation. Since PLZA only supports global assignment does it not mean that resctrl
>> only needs to update PQR_PLZA_ASSOC when user writes to info/kernel_mode and
>> info/kernel_mode_assignment?
> 
> Each thread has an MSR to configure whether to associate privilege level zero execution with a separate COS and/or RMID, and the value of the COS and/or RMID.  PLZA may be enabled or disabled on a per-thread basis. However, the COS and RMID association and configuration must be the same for all threads in the QOS Domain.

Based on previous comment in https://lore.kernel.org/lkml/abb049fa-3a3d-4601-9ae3-61eeb7fd8fcf@amd.com/ 
and this implementation all fields of PQR_PLZA_ASSOC except PQR_PLZA_ASSOC.plza_en must be the
same for all CPUs on the system, not just per QoS domain. Could you please confirm?

> 
> So, PQR_PLZA_ASSOC is a per thread MSR just like PQR_ASSOC.
> 
> Privilege-Level Zero Association (PLZA) allows the user to specify a COS and/or RMID associated with execution in Privilege-Level Zero. When enabled on a HW thread, when that thread enters Privilege-Level Zero, transactions associated with that thread will be associated with the PLZA COS and/or RMID. Otherwise, the HW thread will be associated with the COS and RMID identified by PQR_ASSOC.
> 
> More below.
> 
>>
>> Consider some of the scenarios:
>>
>> resctrl mount with default state:
>>
>>     # cat info/kernel_mode
>>     [inherit_ctrl_and_mon]
>>     global_assign_ctrl_inherit_mon
>>     global_assign_ctrl_assign_mon
>>     # ls info/kernel_mode_assignment
>>     ls: cannot access 'info/kernel_mode_assignment': No such file or directory
>>
>> enable global_assign_ctrl_assign_mon mode:
>>     # echo "global_assign_ctrl_assign_mon" > info/kernel_mode
>>
>> Expectation here is that when user space sets this mode as above then resctrl would
>> in turn program MSR_IA32_PQR_PLZA_ASSOC on all CPUs to be:
>>     MSR_IA32_PQR_PLZA_ASSOC.rmid=0
>>     MSR_IA32_PQR_PLZA_ASSOC.rmid_en=1
>>     MSR_IA32_PQR_PLZA_ASSOC.closid=0
>>     MSR_IA32_PQR_PLZA_ASSOC.closid_en=1
>>     MSR_IA32_PQR_PLZA_ASSOC.plza_en=1
>>
>> I do not see why it is necessary to maintain any per-CPU or per-task state or needing
>> to touch the context switch code. Since PLZA only supports global could it not
>> just set MSR_IA32_PQR_PLZA_ASSOC on all online CPUs and be done with it?
>> Only caveat is that if a CPU is offline then this setting needs to be stashed
>> so that MSR_IA32_PQR_PLZA_ASSOC can be set when new CPU comes online.
>>
>> The way that rdtgroup_config_kmode() introduced in patch #11 assumes it is dealing
>> with RDT_RESOURCE_L3 and traverses the resource domain list and resource group
>> CPU mask seems unnecessary to me as well as error prone since the system may only
>> have, for example, RDT_RESOURCE_MBA enabled or even just monitoring. Why not just set
>> MSR_IA32_PQR_PLZA_ASSOC on all CPUs and be done?
>>
>> To continue the scenarios ...
>>
>> After user's setting above related files read:
>>     # cat info/kernel_mode
>>     inherit_ctrl_and_mon
>>     global_assign_ctrl_inherit_mon
>>     [global_assign_ctrl_assign_mon]
>>     # cat info/kernel_mode_assignment
>>     //
>>
>> Modify group used by global_assign_ctrl_assign_mon mode:
>>     # echo 'ctrl1/mon1/' > info/kernel_mode_assignment
>>
>> Expectation here is that when user space sets this then resctrl would
>> program MSR_IA32_PQR_PLZA_ASSOC on all CPUs to be:
>>     MSR_IA32_PQR_PLZA_ASSOC.rmid=<rmid of mon1>
>>     MSR_IA32_PQR_PLZA_ASSOC.rmid_en=1
>>     MSR_IA32_PQR_PLZA_ASSOC.closid=<closid of ctrl1>
>>     MSR_IA32_PQR_PLZA_ASSOC.closid_en=1
>>     MSR_IA32_PQR_PLZA_ASSOC.plza_en=1
> 
> 
> This works correctly when PLZA associations are defined by per CPU. For example, lets assume that *ctrl1* is assigned *CLOSID 1*.
> 
> In this scenario, every task in the system running on a any CPU will use the limits associated with *CLOSID 1* whenever it enters Privilege-Level Zero, because the CPU's *PQR_PLZA_ASSOC* register has PLZA enabled and CLOSID is 1.
> 
> Now consider task-based association:
> 
> We have two resctrl groups:
> 
>  * *ctrl1 -> CLOSID 1 -> task1.plza = 1   : *User wants PLZA be enabled
>    for this task.
>  * *ctrl2 -> CLOSID 2 -> task2.plza = 0   : *User wants PLZA
>    disabled for this task.
> 
> Suppose *task1* is first scheduled on *CPU 0*. This behaves as expected: since CPU 0 's *PQR_PLZA_ASSOC* contains *CLOSID 1, plza_en =1*, task1 will use the limits from CLOSID 1 when it enters Privilege-Level Zero.
> 
> However, if *task2* later runs on *CPU 0*, we expect it to use *CLOSID 2* in both user mode and kernel mode, because user has PLZA disabled for this task. But CPU 0 still has *CLOSID 1, **plza_en =1* in its PQR_PLZA_ASSOC register.
> 
> As a result, task2 will incorrectly run with *CLOSID 1* when entering Privilege-Level Zero something we explicitly want to avoid.
> 
> At that point, PLZA must be disabled on CPU 0 to prevent the unintended association. Hope this explanation makes the issue clear.
> 

A couple of points:
- Looks like we still need to come to agreement what is meant by "global" when it
  comes to kernel mode.

  In your description there is a "global" configuration, but the assignment is "per-task".
  To me this sounds like a new and distinct kernel_mode from the "global" modes
  considered so far. This seems to move to the "per_task" mode mentioned in
  earlier discussions, but the implementation does not take any of those
  discussions into account:
  https://lore.kernel.org/lkml/2ab556af-095b-422b-9396-f845c6fd0342@intel.com/

  We only learned about one use case in https://lore.kernel.org/lkml/CABPqkBSq=cgn-am4qorA_VN0vsbpbfDePSi7gubicpROB1=djw@mail.gmail.com/
  As I understand this use case requires PLZA globally enabled for all tasks. Thus
  I consider task assignment to be "global" when in the "global_*" kernel modes.
  If this is indeed a common use case then supporting only global configuration
  but then requiring user space to manually assign all tasks afterwards sounds
  cumbersome for user space and also detrimental to system performance with all
  the churn to modify all the task_structs involved. The accompanying documentation
  does not mention all these additional interactions required of user space
  to use this implementation.

  I find this implementation difficult and inefficient to use in the one use case
  we know of. I would suggest that resctrl optimizes for the one known use case.

- This implementation ignores discussion on how existing resctrl files should
  not be repurposed.

  This implementation allows user space to set a resource group in
  kernel_mode_assignment with the consequence that this resource group's
  "tasks" file changes behavior. I consider this a break of resctrl interface.
  We did briefly consider per-task configuration/assignment in previous discussion
  and the proposal was for it to use a new file (only when and if needed!).

- Now a user is required to write the task id of every task that participates
  in PLZA. Apart from the churn already mentioned this also breaks existing
  usage since it is no longer possible for new tasks to be added to this
  resource group. This creates an awkward interface where all tasks belonging
  to a resource group inherit the allocations/monitoring for their user space
  work and will get PLZA enabled whether user requested it or not while
  tasks from other resource groups need to be explicitly enabled. This creates
  an inconsistency when it comes to task assignment. The only way to "remove"
  PLZA from such a task would be to assign it to another resource group which
  may not have the user space allocations ... and once this is done the task
  cannot be moved back.
  There is no requirement that CLOSID/RMID should be dedicated to kernel work
  but this implementation does so in an inconsistent way.

- Apart from the same issues as with repurposing of tasks file, why should same
  CPU allocation be used for kernel and user space? 

Reinette
Re: [PATCH v2 00/16] fs,x86/resctrl: Add kernel-mode (e.g., PLZA) support to the resctrl subsystem
Posted by Babu Moger 1 week ago
Hi Reinette,

On 3/27/26 17:11, Reinette Chatre wrote:
> Hi Babu,
>
> On 3/26/26 10:12 AM, Babu Moger wrote:
>> Hi Reinette,
>>
>> Thanks for the review comments. Will address one by one.
>>
>> On 3/24/26 17:51, Reinette Chatre wrote:
>>> Hi Babu,
>>>
>>> On 3/12/26 1:36 PM, Babu Moger wrote:
>>>> This series adds support for Privilege-Level Zero Association (PLZA) to the
>>>> resctrl subsystem. PLZA is an AMD feature that allows specifying a CLOSID
>>>> and/or RMID for execution in kernel mode (privilege level zero), so that
>>>> kernel work is not subject to the same resource constraints as the current
>>>> user-space task. This avoids kernel operations being aggressively throttled
>>>> when a task's memory bandwidth is heavily limited.
>>>>
>>>> The feature documentation is not yet publicly available, but it is expected
>>>> to be released in the next few weeks. In the meantime, a brief description
>>>> of the features is provided below.
>>>>
>>>> Privilege Level Zero Association (PLZA)
>>>>
>>>> Privilege Level Zero Association (PLZA) allows the hardware to
>>>> automatically associate execution in Privilege Level Zero (CPL=0) with a
>>>> specific COS (Class of Service) and/or RMID (Resource Monitoring
>>>> Identifier). The QoS feature set already has a mechanism to associate
>>>> execution on each logical processor with an RMID or COS. PLZA allows the
>>>> system to override this per-thread association for a thread that is
>>>> executing with CPL=0.
>>>> ------------------------------------------------------------------------
>>>>
>>>> The series introduces the feature in a way that supports the interface in
>>>> a generic manner to accommodate MPAM or other vendor specific implementation.
>>>>
>>>> Below is the detailed requirements provided by Reinette:
>>>> https://lore.kernel.org/lkml/2ab556af-095b-422b-9396-f845c6fd0342@intel.com/
>>> Our discussion considered how resctrl could support PLZA in a generic way while
>>> also preparing to support MPAM's variants and how PLZA may evolve to have similar
>>> capabilities when considering the capabilities of its registers.
>>>
>>> This does not mean that your work needs to implement everything that was discussed.
>>> Instead, this work is expected to just support what PLZA is capable of today but
>>> do so in a way that the future enhancements could be added to.
>>>
>>> This series is quite difficult to follow since it appears to implement a full
>>> featured generic interface while PLZA cannot take advantage of it.
>>>
>>> Could you please simplify this work to focus on just enabling PLZA and only
>>> add interfaces needed to do so?
>> Sure. Will try. Lets continue the discussion.
>>>> Summary:
>>>> 1. Kernel-mode/PLZA controls and status should be exposed under the resctrl
>>>>      info directory:/sys/fs/resctrl/info/, not as a separate or arch-specific path.
>>>>
>>>> 2. Add two info files
>>>>
>>>>    a. kernel_mode
>>>>       Purpose: Control how resource allocation and monitoring apply in kernel mode
>>>>       (e.g. inherit from task vs global assign).
>>>>
>>>>       Read: List supported modes and show current one (e.g. with [brackets]).
>>>>       Write: Set current mode by name (e.g. inherit_ctrl_and_mon, global_assign_ctrl_assign_mon).
>>>>
>>>> b. kernel_mode_assignment
>>>>
>>>>      Purpose: When a “global assign” kernel mode is active, specify which resctrl group
>>>>      (CLOSID/RMID) is used for kernel work.
>>>>
>>>>      Read: Show the assigned group in a path-like form (e.g. //, ctrl1//, ctrl1/mon1/).
>>>>      Write: Assign or clear the group used for kernel mode (and optionally clear with an empty write).
>>>>
>>>> The patches are based on top of commit (v7.0.0-rc3)
>>>> 839e91ce3f41b (tip/master) Merge branch into tip/master: 'x86/tdx'
>>>> ------------------------------------------------------------------------
>>>>
>>>> Examples: kernel_mode and kernel_mode_assignment
>>>>
>>>> All paths below are under /sys/fs/resctrl/ (e.g. info/kernel_mode means
>>>> /sys/fs/resctrl/info/kernel_mode). Resctrl must be mounted and the platform
>>>> must support the relevant modes (e.g. AMD with PLZA).
>>>>
>>>> 1) kernel_mode — show and set the current kernel mode
>>>>
>>>>      Read supported modes and which one is active (current in brackets):
>>>>
>>>>        $ cat info/kernel_mode
>>>>        [inherit_ctrl_and_mon]
>>>>        global_assign_ctrl_inherit_mon
>>>>        global_assign_ctrl_assign_mon
>>>>
>>>>      Set the active mode (e.g. use one CLOSID+RMID for all kernel work):
>>>>
>>>>        $ echo "global_assign_ctrl_assign_mon" > info/kernel_mode
>>>>        $ cat info/kernel_mode
>>>>        inherit_ctrl_and_mon
>>>>        global_assign_ctrl_inherit_mon
>>>>        [global_assign_ctrl_assign_mon]
>>>>
>>>>      Mode meanings:
>>>>      - inherit_ctrl_and_mon: kernel uses same CLOSID/RMID as the current task (default).
>>>>      - global_assign_ctrl_inherit_mon: one CLOSID for all kernel work; RMID inherited from user.
>>>>      - global_assign_ctrl_assign_mon: one resource group (CLOSID+RMID) for all kernel work.
>>>>
>>>> 2) kernel_mode_assignment — show and set which group is used for kernel work
>>>>
>>>>      Only relevant when kernel_mode is not "inherit_ctrl_and_mon". Read the
>>> To help with future usages please connect visibility of this file with the mode in
>>> info/kernel_mode. This helps us to support future modes with other resctrl files, possible
>>> within each resource group.
>>> Specifically, kernel_mode_assignment is not visible to user space if mode is "inherit_ctrl_and_mon",
>>> while it is visible when mode is global_assign_ctrl_inherit_mon or global_assign_ctrl_assign_mon.
>> Sure. Will do.
>>
>>>>      currently assigned group (path format is "CTRL_MON/MON/"):
>>> The format depends on the mode, right? If the mode is "global_assign_ctrl_inherit_mon"
>>> then it should only contain a control group, alternatively, if the mode is
>>> "global_assign_ctrl_assign_mon" then it contains control and mon group. This gives
>>> resctrl future flexibility to change format for future modes.
>> This can be done both ways.  Whole purpose of these groups is to get CLOSID and RMID to enable PLZA. User can echo CTRL_MON or MON group to kernel_mode_assignment in any of the modes.  We can decide what needs to be updated in MSR (PQR_PLZA_ASSOC) based on what kernel mode is selected.
> The "both ways" are specific to one of the two active modes though.
> PLZA only needs the RMID when the mode is "global_assign_ctrl_assign_mon".
>
> Displaying and parsing monitor group when the mode is
> "global_assign_ctrl_inherit_mon" creates an inconsistent interface since the mode
> only uses a control group. The interface to user space should match the mode otherwise
> it becomes confusing.
Ok. That is fine. We can do that.
> ...
>
>
>>>>        Tony suggested using global variables to store the kernel mode
>>>>        CLOSID and RMID. However, the kernel mode CLOSID and RMID are
>>>>        coming from rdtgroup structure with the new interface. Accessing
>>>>        them requires holding the associated lock, which would make the
>>>>        context switch path unnecessarily expensive. So, dropped the idea.
>>>>        https://lore.kernel.org/lkml/aXuxVSbk1GR2ttzF@agluck-desk3/
>>>>        Let me know if there are other ways to optimize this.
>>> I do not see why the context switch path needs to be touched at all with this
>>> implementation. Since PLZA only supports global assignment does it not mean that resctrl
>>> only needs to update PQR_PLZA_ASSOC when user writes to info/kernel_mode and
>>> info/kernel_mode_assignment?
>> Each thread has an MSR to configure whether to associate privilege level zero execution with a separate COS and/or RMID, and the value of the COS and/or RMID.  PLZA may be enabled or disabled on a per-thread basis. However, the COS and RMID association and configuration must be the same for all threads in the QOS Domain.
> Based on previous comment in https://lore.kernel.org/lkml/abb049fa-3a3d-4601-9ae3-61eeb7fd8fcf@amd.com/
> and this implementation all fields of PQR_PLZA_ASSOC except PQR_PLZA_ASSOC.plza_en must be the
> same for all CPUs on the system, not just per QoS domain. Could you please confirm?

Sorry for the confusion. It is "per QoS domain".

All the fields of PQR_PLZA_ASSOC except PQR_PLZA_ASSOC.plza_en must be set to
the same value for all HW threads in the QoS domain for consistent operation
(per QoS domain).
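
The constraint can be expressed as a simple check over the MSR values of
the HW threads in one domain. This is a sketch only: PLZA_EN_BIT is a
placeholder bit position chosen for illustration (the real layout is not
given in this thread), and plza_domain_consistent() is a hypothetical
helper.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder position for plza_en; the real bit layout is an assumption here. */
#define PLZA_EN_BIT	(1ULL << 63)

/* True if all HW threads in a domain agree on every field except plza_en. */
bool plza_domain_consistent(const uint64_t *msr_vals, size_t nr_threads)
{
	size_t i;

	for (i = 1; i < nr_threads; i++) {
		if ((msr_vals[i] & ~PLZA_EN_BIT) != (msr_vals[0] & ~PLZA_EN_BIT))
			return false;
	}
	return true;
}
```

Threads in the same domain may differ in plza_en and still pass the check;
any difference in the remaining bits fails it.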

>
>> So, PQR_PLZA_ASSOC is a per thread MSR just like PQR_ASSOC.
>>
>> Privilege-Level Zero Association (PLZA) allows the user to specify a COS and/or RMID associated with execution in Privilege-Level Zero. When enabled on a HW thread, when that thread enters Privilege-Level Zero, transactions associated with that thread will be associated with the PLZA COS and/or RMID. Otherwise, the HW thread will be associated with the COS and RMID identified by PQR_ASSOC.
>>
>> More below.
>>
>>> Consider some of the scenarios:
>>>
>>> resctrl mount with default state:
>>>
>>>      # cat info/kernel_mode
>>>      [inherit_ctrl_and_mon]
>>>      global_assign_ctrl_inherit_mon
>>>      global_assign_ctrl_assign_mon
>>>      # ls info/kernel_mode_assignment
>>>      ls: cannot access 'info/kernel_mode_assignment': No such file or directory
>>>
>>> enable global_assign_ctrl_assign_mon mode:
>>>      # echo "global_assign_ctrl_assign_mon" > info/kernel_mode
>>>
>>> Expectation here is that when user space sets this mode as above then resctrl would
>>> in turn program MSR_IA32_PQR_PLZA_ASSOC on all CPUs to be:
>>>      MSR_IA32_PQR_PLZA_ASSOC.rmid=0
>>>      MSR_IA32_PQR_PLZA_ASSOC.rmid_en=1
>>>      MSR_IA32_PQR_PLZA_ASSOC.closid=0
>>>      MSR_IA32_PQR_PLZA_ASSOC.closid_en=1
>>>      MSR_IA32_PQR_PLZA_ASSOC.plza_en=1
>>>
>>> I do not see why it is necessary to maintain any per-CPU or per-task state or needing
>>> to touch the context switch code. Since PLZA only supports global could it not
>>> just set MSR_IA32_PQR_PLZA_ASSOC on all online CPUs and be done with it?
>>> Only caveat is that if a CPU is offline then this setting needs to be stashed
>>> so that MSR_IA32_PQR_PLZA_ASSOC can be set when new CPU comes online.
>>>
>>> The way that rdtgroup_config_kmode() introduced in patch #11 assumes it is dealing
>>> with RDT_RESOURCE_L3 and traverses the resource domain list and resource group
>>> CPU mask seems unnecessary to me as well as error prone since the system may only
>>> have, for example, RDT_RESOURCE_MBA enabled or even just monitoring. Why not just set
>>> MSR_IA32_PQR_PLZA_ASSOC on all CPUs and be done?
>>>
>>> To continue the scenarios ...
>>>
>>> After user's setting above related files read:
>>>      # cat info/kernel_mode
>>>      inherit_ctrl_and_mon
>>>      global_assign_ctrl_inherit_mon
>>>      [global_assign_ctrl_assign_mon]
>>>      # cat info/kernel_mode_assignment
>>>      //
>>>
>>> Modify group used by global_assign_ctrl_assign_mon mode:
>>>      # echo 'ctrl1/mon1/' > info/kernel_mode_assignment
>>>
>>> Expectation here is that when user space sets this then resctrl would
>>> program MSR_IA32_PQR_PLZA_ASSOC on all CPUs to be:
>>>      MSR_IA32_PQR_PLZA_ASSOC.rmid=<rmid of mon1>
>>>      MSR_IA32_PQR_PLZA_ASSOC.rmid_en=1
>>>      MSR_IA32_PQR_PLZA_ASSOC.closid=<closid of ctrl1>
>>>      MSR_IA32_PQR_PLZA_ASSOC.closid_en=1
>>>      MSR_IA32_PQR_PLZA_ASSOC.plza_en=1
>>
>> This works correctly when PLZA associations are defined per CPU. For example, let's assume that *ctrl1* is assigned *CLOSID 1*.
>>
>> In this scenario, every task in the system running on any CPU will use the limits associated with *CLOSID 1* whenever it enters Privilege-Level Zero, because the CPU's *PQR_PLZA_ASSOC* register has PLZA enabled and CLOSID is 1.
>>
>> Now consider task-based association:
>>
>> We have two resctrl groups:
>>
>>   * *ctrl1 -> CLOSID 1 -> task1.plza = 1*: User wants PLZA to be enabled
>>     for this task.
>>   * *ctrl2 -> CLOSID 2 -> task2.plza = 0*: User wants PLZA to be
>>     disabled for this task.
>>
>> Suppose *task1* is first scheduled on *CPU 0*. This behaves as expected: since CPU 0's *PQR_PLZA_ASSOC* contains *CLOSID 1, plza_en = 1*, task1 will use the limits from CLOSID 1 when it enters Privilege-Level Zero.
>>
>> However, if *task2* later runs on *CPU 0*, we expect it to use *CLOSID 2* in both user mode and kernel mode, because the user has PLZA disabled for this task. But CPU 0 still has *CLOSID 1, plza_en = 1* in its PQR_PLZA_ASSOC register.
>>
>> As a result, task2 will incorrectly run with *CLOSID 1* when entering Privilege-Level Zero, which is something we explicitly want to avoid.
>>
>> At that point, PLZA must be disabled on CPU 0 to prevent the unintended association. Hope this explanation makes the issue clear.
>>
> A couple of points:
> - Looks like we still need to come to agreement what is meant by "global" when it
>    comes to kernel mode.
>
>    In your description there is a "global" configuration, but the assignment is "per-task".
>    To me this sounds like a new and distinct kernel_mode from the "global" modes
>    considered so far. This seems to move to the "per_task" mode mentioned in but
>    the implementation does not take into account any of the earlier discussions
>    surrounding it:
>    https://lore.kernel.org/lkml/2ab556af-095b-422b-9396-f845c6fd0342@intel.com/
>
>    We only learned about one use case in https://lore.kernel.org/lkml/CABPqkBSq=cgn-am4qorA_VN0vsbpbfDePSi7gubicpROB1=djw@mail.gmail.com/
>    As I understand this use case requires PLZA globally enabled for all tasks. Thus
>    I consider task assignment to be "global" when in the "global_*" kernel modes.
>    If this is indeed a common use case then supporting only global configuration
>    but then requiring user space to manually assign all tasks afterwards sounds
>    cumbersome for user space and also detrimental to system performance with all
>    the churn to modify all the task_structs involved. The accompanying documentation
>    does not mention all the additional interactions required of user
>    space to use this implementation.
>
>    I find this implementation difficult and inefficient to use in the one use case
>    we know of. I would suggest that resctrl optimizes for the one known use case.
>
> - This implementation ignores discussion on how existing resctrl files should
>    not be repurposed.
>
>    This implementation allows user space to set a resource group in
>    kernel_mode_assignment with the consequence that this resource group's
>    "tasks" file changes behavior. I consider this a break of resctrl interface.
>    We did briefly consider per-task configuration/assignment in previous discussion
>    and the proposal was for it to use a new file (only when and if needed!).
>
> - Now a user is required to write the task id of every task that participates
>    in PLZA. Apart from the churn already mentioned this also breaks existing
>    usage since it is no longer possible for new tasks to be added to this
>    resource group. This creates an awkward interface where all tasks belonging
>    to a resource group inherit the allocations/monitoring for their user space
>    work and will get PLZA enabled whether user requested it or not while
>    tasks from other resource groups need to be explicitly enabled. This creates
>    an inconsistency when it comes to task assignment. The only way to "remove"
>    PLZA from such a task would be to assign it to another resource group which
>    may not have the user space allocations ... and once this is done the task
>    cannot be moved back.
>    There is no requirement that CLOSID/RMID should be dedicated to kernel work
>    but this implementation does so in an inconsistent way.
>
> - Apart from the same issues as with repurposing of tasks file, why should same
>    CPU allocation be used for kernel and user space?
>
Yes, I agree with your concerns. The goal here is to make the interface 
less disruptive while still addressing the different use cases.


      Background: Customers have identified an issue with the QoS
      Bandwidth Control feature: when a CLOS is aggressively throttled
      and execution transitions into kernel mode, kernel operations are
      also subject to the same aggressive throttling.

Privilege-Level Zero Association (PLZA) allows a user to specify a COS 
and/or RMID to be used during execution at Privilege Level Zero. When 
PLZA is enabled on a hardware thread, any execution that enters 
Privilege Level Zero will have its transactions associated with the PLZA 
COS and/or RMID. Otherwise, the thread continues to use the COS and RMID 
specified by |PQR_ASSOC|. In other words, the hardware provides a 
dedicated COS and/or RMID specifically for kernel-mode execution.

There are multiple ways this feature can be applied. For simplicity, the 
discussion below focuses only on CLOSID.


      1. Global PLZA enablement

PLZA can be configured as a global feature by setting 
|PQR_PLZA_ASSOC.closid = CLOSID| and |PQR_PLZA_ASSOC.plza_en = 1| on all 
threads in the system. A dedicated CLOSID is reserved for this purpose, 
and all CPU threads use its allocations whenever they enter Privilege 
Level Zero. This CLOSID does not need to be associated with any resctrl 
group. The user can explicitly enable or disable this feature. There is 
no context switch overhead but there is no flexibility with this approach.
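
A minimal sketch of the global option, written as a toy Python model rather than kernel code. The field names follow the PQR_PLZA_ASSOC fields mentioned in this thread; the class names (PlzaAssoc, GlobalPlza) and methods are hypothetical, the real bit layout is not modelled, and no MSR is touched. It also stashes the setting so that a CPU coming online later receives it, as raised earlier in the thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlzaAssoc:
    """Toy stand-in for the PQR_PLZA_ASSOC fields named in this thread."""
    closid: int = 0
    closid_en: bool = False
    rmid: int = 0
    rmid_en: bool = False
    plza_en: bool = False


class GlobalPlza:
    """Hypothetical helper: one system-wide setting written to every online CPU."""

    def __init__(self, online_cpus):
        self.msr = {cpu: PlzaAssoc() for cpu in online_cpus}
        # Stashed so that CPUs coming online later pick up the same value.
        self.setting = PlzaAssoc()

    def set_global(self, closid, enable=True):
        # Runs only when user space changes the configuration,
        # never on the context switch path.
        self.setting = PlzaAssoc(closid=closid, closid_en=enable, plza_en=enable)
        for cpu in self.msr:
            self.msr[cpu] = self.setting

    def cpu_online(self, cpu):
        self.msr[cpu] = self.setting

    def cpu_offline(self, cpu):
        self.msr.pop(cpu, None)


plza = GlobalPlza(online_cpus=[0, 1, 2])
plza.set_global(closid=1)
plza.cpu_online(3)
```

In this model every CPU, including the one brought online afterwards, ends up with the same CLOSID and plza_en set.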


      2. Group based PLZA allocation

PLZA is managed via a dedicated resctrl group. A separate resctrl group
can be created specifically for PLZA, with a dedicated CLOSID used
exclusively for kernel mode execution. This approach can be further
divided into two association models:

i) CPU based association
CPUs are assigned to the PLZA group, and PLZA is enabled only on those 
CPUs. This effectively creates a dedicated PLZA group. MSRs
(|PQR_PLZA_ASSOC|) are programmed only when the user changes CPU
assignments. This approach requires no changes to the context switch 
code and introduces no additional context switch overhead.

ii) Task based association
Tasks are explicitly assigned by the user to the PLZA group. Task state
needs to be updated whenever the user adds a new task. Also, this requires
updates during task scheduling so that the MSRs (|PQR_PLZA_ASSOC|) are
programmed on each context switch, which introduces additional context
switch overhead.
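
To make the difference between the two association models concrete, here is a small Python model (a sketch only, not kernel code). It reuses the task1/task2 example from earlier in the thread. The Task and Cpu classes and the context_switch() helper are hypothetical names, and the attributes merely stand in for the PQR_ASSOC and PQR_PLZA_ASSOC fields.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    closid: int   # CLOSID of the task's resctrl group (used in user mode)
    plza: bool    # whether the user wants PLZA for this task


@dataclass
class Cpu:
    pqr_closid: int = 0     # stands in for PQR_ASSOC.closid
    plza_closid: int = 0    # stands in for PQR_PLZA_ASSOC.closid
    plza_en: bool = False   # stands in for PQR_PLZA_ASSOC.plza_en

    def effective_closid(self, cpl):
        # At CPL=0 with PLZA enabled the PLZA CLOSID applies,
        # otherwise the CLOSID from PQR_ASSOC is used.
        if cpl == 0 and self.plza_en:
            return self.plza_closid
        return self.pqr_closid


def context_switch(cpu, task, per_task_plza):
    cpu.pqr_closid = task.closid
    if per_task_plza:
        # Task based association: extra work on every switch.
        cpu.plza_en = task.plza


task1 = Task("task1", closid=1, plza=True)
task2 = Task("task2", closid=2, plza=False)

# CPU based association: PLZA is programmed once and left alone at switch time.
cpu_based = Cpu(plza_closid=1, plza_en=True)
context_switch(cpu_based, task2, per_task_plza=False)
cpu_based_kernel_closid = cpu_based.effective_closid(cpl=0)

# Task based association: plza_en follows the incoming task.
task_based = Cpu(plza_closid=1, plza_en=True)
context_switch(task_based, task2, per_task_plza=True)
task_based_kernel_closid = task_based.effective_closid(cpl=0)
```

With CPU based association task2 runs kernel work under the CPU's PLZA CLOSID regardless of its own preference; with task based association the extra update at context switch keeps task2 on its own CLOSID.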

I tried to fit these requirements into the interface files in
/sys/fs/resctrl/info/. I may have missed a few things while trying to
achieve it. As usual, I am open to discussion and recommendations.

Thanks,
Babu

Re: [PATCH v2 00/16] fs,x86/resctrl: Add kernel-mode (e.g., PLZA) support to the resctrl subsystem
Posted by Reinette Chatre 6 days, 14 hours ago
Hi Babu,

On 3/30/26 11:46 AM, Babu Moger wrote:
> On 3/27/26 17:11, Reinette Chatre wrote:
>> On 3/26/26 10:12 AM, Babu Moger wrote:
>>> On 3/24/26 17:51, Reinette Chatre wrote:
>>>> On 3/12/26 1:36 PM, Babu Moger wrote:

>>>>>        Tony suggested using global variables to store the kernel mode
>>>>>        CLOSID and RMID. However, the kernel mode CLOSID and RMID are
>>>>>        coming from rdtgroup structure with the new interface. Accessing
>>>>>        them requires holding the associated lock, which would make the
>>>>>        context switch path unnecessarily expensive. So, dropped the idea.
>>>>>        https://lore.kernel.org/lkml/aXuxVSbk1GR2ttzF@agluck-desk3/
>>>>>        Let me know if there are other ways to optimize this.
>>>> I do not see why the context switch path needs to be touched at all with this
>>>> implementation. Since PLZA only supports global assignment does it not mean that resctrl
>>>> only needs to update PQR_PLZA_ASSOC when user writes to info/kernel_mode and
>>>> info/kernel_mode_assignment?
>>> Each thread has an MSR to configure whether to associate privilege level zero execution with a separate COS and/or RMID, and the value of the COS and/or RMID.  PLZA may be enabled or disabled on a per-thread basis. However, the COS and RMID association and configuration must be the same for all threads in the QOS Domain.
>> Based on previous comment in https://lore.kernel.org/lkml/abb049fa-3a3d-4601-9ae3-61eeb7fd8fcf@amd.com/
>> and this implementation all fields of PQR_PLZA_ASSOC except PQR_PLZA_ASSOC.plza_en must be the
>> same for all CPUs on the system, not just per QoS domain. Could you please confirm?
> 
> Sorry for the confusion. It is "per QoS domain".
> 
> All the fields of PQR_PLZA_ASSOC except PQR_PLZA_ASSOC.plza_en must be set to the same value for all HW threads in the QOS domain for consistent operation (Per-QosDomain).

Thank you for clarifying. To build on this, what would be the best way for resctrl to interpret this?
As I see it all values in PQR_PLZA_ASSOC apply to *all* resources yet (theoretically?) every resource
can have domains that span different CPUs. There thus seems to be a built-in assumption of what a "domain"
means for PQR_PLZA_ASSOC so it sounds to me as though, instead of saying that "PQR_PLZA_ASSOC needs
to be the same in QoS domain" it may be more accurate to, for example, say that "PQR_PLZA_ASSOC has L3 scope"?

This seems to be what this implementation does since it hardcodes PQR_PLZA_ASSOC scope to the L3
resource but that creates a dependency on the L3 resource that would make PLZA unusable if, for example,
the user boots with "rdt=!l3cat" while wanting to use PLZA to manage MBA allocations when in kernel?
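
To illustrate the constraint being discussed, below is a rough Python sketch (not kernel code) of a check that all PQR_PLZA_ASSOC fields except plza_en agree among the CPUs of one domain. The function name, the dictionary representation of the MSR, and the CPU-to-domain map are all assumptions for illustration. The map is passed in by the caller, so the check itself is not tied to the L3 resource.

```python
from collections import defaultdict


def inconsistent_domains(msr_by_cpu, domain_of_cpu):
    """Return ids of domains whose CPUs disagree on any field other than plza_en.

    msr_by_cpu:    {cpu: {"closid": .., "closid_en": .., "rmid": .., "rmid_en": .., "plza_en": ..}}
    domain_of_cpu: {cpu: domain id}, supplied by the caller (an assumption of this sketch).
    """
    seen = defaultdict(set)
    for cpu, fields in msr_by_cpu.items():
        # plza_en may differ per thread, so it is excluded from the comparison.
        shared = tuple(sorted((k, v) for k, v in fields.items() if k != "plza_en"))
        seen[domain_of_cpu[cpu]].add(shared)
    return sorted(dom for dom, variants in seen.items() if len(variants) > 1)


def assoc(closid, plza_en):
    # Convenience constructor for the example values below.
    return {"closid": closid, "closid_en": 1, "rmid": 0, "rmid_en": 0, "plza_en": plza_en}


msrs = {0: assoc(1, 1), 1: assoc(1, 0), 2: assoc(1, 1), 3: assoc(2, 1)}
domains = {0: 0, 1: 0, 2: 1, 3: 1}
bad = inconsistent_domains(msrs, domains)
```

Here CPUs 0 and 1 differ only in plza_en, which is allowed, while CPUs 2 and 3 carry different CLOSIDs within the same domain, which is what the hardware description rules out.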

...

> Yes, I agree with your concerns. The goal here is to make the interface less disruptive while still addressing the different use cases.

I consider changing resctrl behavior when values are written to existing resctrl files
to be disruptive. This is something we explicitly discussed during v1 as something to
be avoided so this implementation that overloads the tasks file again is unexpected.

>      Background: Customers have identified an issue with the QoS
>      Bandwidth Control feature: when a CLOS is aggressively throttled
>      and execution transitions into kernel mode, kernel operations are
>      also subject to the same aggressive throttling.
>
> Privilege-Level Zero Association (PLZA) allows a user to specify a
> COS and/or RMID to be used during execution at Privilege Level Zero.
> When PLZA is enabled on a hardware thread, any execution that enters
> Privilege Level Zero will have its transactions associated with the
> PLZA COS and/or RMID. Otherwise, the thread continues to use the COS
> and RMID specified by |PQR_ASSOC|. In other words, the hardware
> provides a dedicated COS and/or RMID specifically for kernel-mode
> execution.
ack.

> 
> There are multiple ways this feature can be applied. For simplicity, the discussion below focuses only on CLOSID.
> 
> 
>      1. Global PLZA enablement
> 
> PLZA can be configured as a global feature by setting |PQR_PLZA_ASSOC.closid = CLOSID| and |PQR_PLZA_ASSOC.plza_en = 1| on all threads in the system. A dedicated CLOSID is reserved for this purpose,

Also discussed during v1 is that there is no need to dedicate a CLOSID for this purpose.
There could be an "unthrottled" CLOSID to which all high priority user space tasks as
well as all kernel work of all tasks are assigned.
If user space chooses to dedicate a CLOSID for kernel work then that should be supported and
the interface can allow that, but there is no need for resctrl to enforce this.

> and all CPU threads use its allocations whenever they enter Privilege Level Zero. This CLOSID does not need to be associated with any resctrl group.

The CLOSID has to be associated with a resource group to be able to manage its
resource allocations, no?

> The user can explicitly enable or disable this feature.
ack.

> There is no context switch overhead but there is no flexibility with this approach.

Flexibility is subjective. As I understand this supports the only use case we learned about so far:
https://lore.kernel.org/lkml/CABPqkBSq=cgn-am4qorA_VN0vsbpbfDePSi7gubicpROB1=djw@mail.gmail.com/

>      2. Group based PLZA allocation :  PLZA is managed via dedicated
>      resctrl group. A separate resctrl group can be created
>      specifically for PLZA, with a dedicated CLOSID used exclusively
>      for kernel mode execution. This approach can be further divided
>      into two association models:

So far this sounds like global allocation since both need a dedicated resource group.
Whether this group is dedicated to kernel work or shared between kernel and user space work
is up to the user. There is no motivation why CLOSID should ever be enforced to be
exclusive for kernel mode execution.

> 
> i) CPU based association
> CPUs are assigned to the PLZA group, and PLZA is enabled only on
> those CPUs. This effectively creates a dedicated PLZA group. MSRs
> (|PQR_PLZA_ASSOC|) are programmed only when the user changes CPU
> assignments. This approach requires no changes to the context switch
> code and introduces no additional context switch overhead.
> 
> ii) Task based association
> Tasks are explicitly assigned by the user to the PLZA group. Tasks
> need to be updated when user adds a new task. Also, this requires
> updates during task scheduling so that the MSRs (|PQR_PLZA_ASSOC|)
> are programmed on each context switch, which introduces additional
> context switch overhead.

As discussed during v1 any changes needed to support per task assignment would
need to be done with new files dedicated to this purpose. Do not overload the
existing resctrl tasks/cpus/cpus_list files.
 
> I tried to fit these requirements into the interface files in
> /sys/fs/resctrl/info/. I may have missed a few things while trying to
> achieve it. As usual, I am open to discussion and
> recommendations.

Many of these items were already discussed as part of v1 so I think we may be
talking past each other here. I tried to highlight the relevant points raised
during v1 discussion that I thought there already was agreement on. 

The one new aspect is that I assumed this implementation will only be for
global configuration and assignment. It looks like you want to support both
global configuration and per-task assignment. In the original I did not consider
configuration and assignment to occur at different scope so we may need to come up
with new modes to distinguish. Consider the addition of two modes as below:

	# cat info/kernel_mode
	[inherit_ctrl_and_mon]
	global_assign_ctrl_inherit_mon_set_all
	global_assign_ctrl_assign_mon_set_all
	global_assign_ctrl_inherit_mon_set_individual
	global_assign_ctrl_assign_mon_set_individual

Above introduces a "set_all" and "set_individual" suffix to the original two
modes.

global_assign_ctrl_inherit_mon_set_all
global_assign_ctrl_assign_mon_set_all:

	Above are the original two modes but makes it clear that when this mode is
	activated _all_ tasks run with the assignment.

global_assign_ctrl_inherit_mon_set_individual
global_assign_ctrl_assign_mon_set_individual:

	Above are two new modes. In this mode user space also assigns a resource
	group globally but then needs to follow that up by activating every task
	separately to run with this assignment.
	One way in which this can be accomplished could be to have "kernel_mode_tasks",
	"kernel_mode_cpus", and "kernel_mode_cpus_list"	files become visible (or be
	created) in the resource group found in	info/kernel_mode_assignment. User
	space interacts with the new files to set which tasks and/or CPUs run with
	PLZA enabled.
	
Even so, as I understand global_assign_ctrl_inherit_mon_set_all and
global_assign_ctrl_assign_mon_set_all address the only known use case. Do you know
if there are use cases for global_assign_ctrl_inherit_mon_set_individual and
global_assign_ctrl_assign_mon_set_individual? The latter two add significant
complexity to resctrl while I have not heard about any use case for them.
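
For reference, the bracket convention used in the info/kernel_mode listing above (the active mode is shown in square brackets) could be consumed by user space roughly as follows. This is only a Python sketch: parse_mode_file() is a hypothetical helper, and the mode names are the ones proposed in this message, not an existing interface.

```python
def parse_mode_file(text):
    """Return (active_mode, all_modes) from a listing where the active entry is in [brackets]."""
    active = None
    modes = []
    for line in text.splitlines():
        name = line.strip()
        if not name:
            continue
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        modes.append(name)
    return active, modes


listing = """\
[inherit_ctrl_and_mon]
global_assign_ctrl_inherit_mon_set_all
global_assign_ctrl_assign_mon_set_all
global_assign_ctrl_inherit_mon_set_individual
global_assign_ctrl_assign_mon_set_individual
"""
active, modes = parse_mode_file(listing)
# Only the "_set_individual" modes would require user space to activate tasks or CPUs separately.
needs_individual_activation = active.endswith("_set_individual")
```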

Reinette
Re: [PATCH v2 00/16] fs,x86/resctrl: Add kernel-mode (e.g., PLZA) support to the resctrl subsystem
Posted by Babu Moger 14 hours ago
Hi Reinette,

Sorry for the late response. I was trying to get confirmation about the 
use case.

On 3/31/26 17:24, Reinette Chatre wrote:
> Hi Babu,
> 
> On 3/30/26 11:46 AM, Babu Moger wrote:
>> On 3/27/26 17:11, Reinette Chatre wrote:
>>> On 3/26/26 10:12 AM, Babu Moger wrote:
>>>> On 3/24/26 17:51, Reinette Chatre wrote:
>>>>> On 3/12/26 1:36 PM, Babu Moger wrote:
> 
>>>>>>         Tony suggested using global variables to store the kernel mode
>>>>>>         CLOSID and RMID. However, the kernel mode CLOSID and RMID are
>>>>>>         coming from rdtgroup structure with the new interface. Accessing
>>>>>>         them requires holding the associated lock, which would make the
>>>>>>         context switch path unnecessarily expensive. So, dropped the idea.
>>>>>>         https://lore.kernel.org/lkml/aXuxVSbk1GR2ttzF@agluck-desk3/
>>>>>>         Let me know if there are other ways to optimize this.
>>>>> I do not see why the context switch path needs to be touched at all with this
>>>>> implementation. Since PLZA only supports global assignment does it not mean that resctrl
>>>>> only needs to update PQR_PLZA_ASSOC when user writes to info/kernel_mode and
>>>>> info/kernel_mode_assignment?
>>>> Each thread has an MSR to configure whether to associate privilege level zero execution with a separate COS and/or RMID, and the value of the COS and/or RMID.  PLZA may be enabled or disabled on a per-thread basis. However, the COS and RMID association and configuration must be the same for all threads in the QOS Domain.
>>> Based on previous comment in https://lore.kernel.org/lkml/abb049fa-3a3d-4601-9ae3-61eeb7fd8fcf@amd.com/
>>> and this implementation all fields of PQR_PLZA_ASSOC except PQR_PLZA_ASSOC.plza_en must be the
>>> same for all CPUs on the system, not just per QoS domain. Could you please confirm?
>>
>> Sorry for the confusion. It is "per QoS domain".
>>
>> All the fields of PQR_PLZA_ASSOC except PQR_PLZA_ASSOC.plza_en must be set to the same value for all HW threads in the QOS domain for consistent operation (Per-QosDomain).
> 
> Thank you for clarifying. To build on this, what would be the best way for resctrl to interpret this?
> As I see it all values in PQR_PLZA_ASSOC apply to *all* resources yet (theoretically?) every resource

Yes.  That is correct. PLZA applies to all the resources.

> can have domains that span different CPUs. There thus seems to be a built-in assumption of what a "domain"
> means for PQR_PLZA_ASSOC so it sounds to me as though, instead of saying that "PQR_PLZA_ASSOC needs
> to be the same in QoS domain" it may be more accurate to, for example, say that "PQR_PLZA_ASSOC has L3 scope"?

Yes.

> 
> This seems to be what this implementation does since it hardcodes PQR_PLZA_ASSOC scope to the L3
> resource but that creates a dependency on the L3 resource that would make PLZA unusable if, for example,
> the user boots with "rdt=!l3cat" while wanting to use PLZA to manage MBA allocations when in kernel?

Yes, that is correct. It should not be attached to one resource. We need
to change it to global scope.

> 
> ...
> 
>> Yes, I agree with your concerns. The goal here is to make the interface less disruptive while still addressing the different use cases.
> 
> I consider changing resctrl behavior when values are written to existing resctrl files
> to be disruptive. This is something we explicitly discussed during v1 as something to
> be avoided so this implementation that overloads the tasks file again is unexpected.

Yes. Agree. If required we need to introduce new files (kmode_cpus,
kmode_cpus_list or kmode_tasks) to handle these cases.

> 
>>       Background: Customers have identified an issue with the QoS
>>       Bandwidth Control feature: when a CLOS is aggressively throttled
>>       and execution transitions into kernel mode, kernel operations are
>>       also subject to the same aggressive throttling.
>>
>> Privilege-Level Zero Association (PLZA) allows a user to specify a
>> COS and/or RMID to be used during execution at Privilege Level Zero.
>> When PLZA is enabled on a hardware thread, any execution that enters
>> Privilege Level Zero will have its transactions associated with the
>> PLZA COS and/or RMID. Otherwise, the thread continues to use the COS
>> and RMID specified by |PQR_ASSOC|. In other words, the hardware
>> provides a dedicated COS and/or RMID specifically for kernel-mode
>> execution.
> ack.
> 
>>
>> There are multiple ways this feature can be applied. For simplicity, the discussion below focuses only on CLOSID.
>>
>>
>>       1. Global PLZA enablement
>>
>> PLZA can be configured as a global feature by setting |PQR_PLZA_ASSOC.closid = CLOSID| and |PQR_PLZA_ASSOC.plza_en = 1| on all threads in the system. A dedicated CLOSID is reserved for this purpose,
> 
> Also discussed during v1 is that there is no need to dedicate a CLOSID for this purpose.
> There could be an "unthrottled" CLOSID to which all high priority user space tasks as
> well as all kernel work of all tasks are assigned.
> If user space chooses to dedicate a CLOSID for kernel work then that should be supported and
> the interface can allow that, but there is no need for resctrl to enforce this.
> 
>> and all CPU threads use its allocations whenever they enter Privilege Level Zero. This CLOSID does not need to be associated with any resctrl group.

I misspoke here.

> 
> The CLOSID has to be associated with a resource group to be able to manage its
> resource allocations, no?

Yes. We need to have resource group schemata to enforce the limits.

> 
>> The user can explicitly enable or disable this feature.
> ack.
> 
>> There is no context switch overhead but there is no flexibility with this approach.
> 
> Flexibility is subjective. As I understand this supports the only use case we learned about so far:
> https://lore.kernel.org/lkml/CABPqkBSq=cgn-am4qorA_VN0vsbpbfDePSi7gubicpROB1=djw@mail.gmail.com/
> 
>>       2. Group based PLZA allocation :  PLZA is managed via dedicated
>>       resctrl group. A separate resctrl group can be created
>>       specifically for PLZA, with a dedicated CLOSID used exclusively
>>       for kernel mode execution. This approach can be further divided
>>       into two association models:
> 
> So far this sounds like global allocation since both need a dedicated resource group.
> Whether this group is dedicated to kernel work or shared between kernel and user space work
> is up to the user. There is no motivation why CLOSID should ever be enforced to be
> exclusive for kernel mode execution.

Yes. That is fine.
> 
>>
>> i) CPU based association
>> CPUs are assigned to the PLZA group, and PLZA is enabled only on
>> those CPUs. This effectively creates a dedicated PLZA group. MSRs
>> (|PQR_PLZA_ASSOC|) are programmed only when the user changes CPU
>> assignments. This approach requires no changes to the context switch
>> code and introduces no additional context switch overhead.
>>
>> ii) Task based association
>> Tasks are explicitly assigned by the user to the PLZA group. Tasks
>> need to be updated when user adds a new task. Also, this requires
>> updates during task scheduling so that the MSRs (|PQR_PLZA_ASSOC|)
>> are programmed on each context switch, which introduces additional
>> context switch overhead.
> 
> As discussed during v1 any changes needed to support per task assignment would
> need to be done with new files dedicated to this purpose. Do not overload the
> existing resctrl tasks/cpus/cpus_list files.

Yes. Sure.

>   
>> I tried to fit these requirements into the interface files in
>> /sys/fs/resctrl/info/. I may have missed a few things while trying to
>> achieve it. As usual, I am open to discussion and
>> recommendations.
> 
> Many of these items were already discussed as part of v1 so I think we may be
> talking past each other here. I tried to highlight the relevant points raised
> during v1 discussion that I thought there already was agreement on.
> 
> The one new aspect is that I assumed this implementation will only be for
> global configuration and assignment. It looks like you want to support both
> global configuration and per-task assignment. In the original I did not consider
> configuration and assignment to occur at different scope so we may need to come up
> with new modes to distinguish. Consider the addition of two modes as below:
> 
> 	# cat info/kernel_mode
> 	[inherit_ctrl_and_mon]
> 	global_assign_ctrl_inherit_mon_set_all
> 	global_assign_ctrl_assign_mon_set_all
> 	global_assign_ctrl_inherit_mon_set_individual
> 	global_assign_ctrl_assign_mon_set_individual
> 
> Above introduces a "set_all" and "set_individual" suffix to the original two
> modes.
> 
> global_assign_ctrl_inherit_mon_set_all
> global_assign_ctrl_assign_mon_set_all:
> 
> 	Above are the original two modes but makes it clear that when this mode is
> 	activated _all_ tasks run with the assignment.
> 
> global_assign_ctrl_inherit_mon_set_individual
> global_assign_ctrl_assign_mon_set_individual:
> 
> 	Above are two new modes. In this mode user space also assigns a resource
> 	group globally but then needs to follow that up by activating every task
> 	separately to run with this assignment.
> 	One way in which this can be accomplished could be to have "kernel_mode_tasks",
> 	"kernel_mode_cpus", and "kernel_mode_cpus_list"	files become visible (or be
> 	created) in the resource group found in	info/kernel_mode_assignment. User
> 	space interacts with the new files to set which tasks and/or CPUs run with
> 	PLZA enabled.
> 	
> Even so, as I understand global_assign_ctrl_inherit_mon_set_all and
> global_assign_ctrl_assign_mon_set_all address the only known use case. Do you know
> if there are use cases for global_assign_ctrl_inherit_mon_set_individual and
> global_assign_ctrl_assign_mon_set_individual? The latter two add significant
> complexity to resctrl while I have not heard about any use case for them.
> 

Yes. I agree. The changes in the context switch code are a concern.

You covered some of the cases I was thinking of (xx_set_individual).

How about this idea?

I suggest splitting the PLZA into two distinct aspects:

1. How PLZA is applied within a resource group

2. How PLZA is monitored


Introduce a new file, "info/kmode_type", to describe how kmode applies 
in the system.

# cat info/kmode_type
[global] <- Kernel mode applies to the entire system (all CPUs/tasks)
   cpus   <- Kernel mode applies only to the CPUs in the group
   tasks  <- Kernel mode applies only to the tasks in the group

The "global" option is the default right now and it is the current common
use case.

The "info/kmode_type -> cpus" option introduces new files "kmode_cpus"
and "kmode_cpus_list" for users to apply kmode to a specific set of CPUs.
This lets users change the CPU set for PLZA. The PLZA MSR is updated
when the user changes the association through these files. No context
switch code changes are needed. This will be a dedicated group. The
current resctrl group files, "cpus", "cpus_list" and "tasks", will not be
accessible in this mode. This option gives the user some flexibility
without the context switch overhead.

The "info/kmode_type -> tasks" option introduces a new file,
"kmode_tasks", for users to apply kmode to a specific set of tasks. This
requires context switch changes. This will be a dedicated group. The
current resctrl group files, "cpus", "cpus_list" and "tasks", will not be
accessible in this mode. We currently have no use case for this, so it
will not be supported now.
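
A rough way to picture the three kmode_type options is as a table of which files user space would interact with in the kmode group. The Python sketch below only restates the description above. The helper names are hypothetical, and the behaviour for "global" (the group keeps its usual files) is an assumption, since the proposal does not spell it out.

```python
# Files each proposed kmode_type adds to the kmode group, per the description above.
KMODE_FILES = {
    "global": [],
    "cpus": ["kmode_cpus", "kmode_cpus_list"],
    "tasks": ["kmode_tasks"],
}

# Existing group files described as not accessible in the "cpus" and "tasks" types.
EXISTING_GROUP_FILES = ["cpus", "cpus_list", "tasks"]


def visible_files(kmode_type):
    """Hypothetical helper: assignment files user space would see in the kmode group."""
    if kmode_type not in KMODE_FILES:
        raise ValueError(f"unknown kmode_type: {kmode_type}")
    if kmode_type == "global":
        # Assumption: in "global" the group keeps its usual files.
        return list(EXISTING_GROUP_FILES)
    return list(KMODE_FILES[kmode_type])


def touches_context_switch(kmode_type):
    # Only the task based type needs per-switch MSR updates.
    return kmode_type == "tasks"
```

For example, visible_files("cpus") yields only the two kmode CPU files, and touches_context_switch() is true only for "tasks", which matches the overhead trade-off described above.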


Add a file, "info/kmode_monitor", to describe how kmode is monitored.

# cat info/kmode_monitor
[inherit_ctrl_and_mon]  <- Kernel uses the same CLOSID/RMID as user.
                           Default option for the "global" type.
assign_ctrl_inherit_mon <- One CLOSID for all kernel work; RMID
                           inherited from user.
assign_ctrl_assign_mon  <- One resource group (CLOSID+RMID) for all
                           kernel work. Default option for the "cpus" type.


Rename “kernel_mode_assignment” to “kmode_group” to assign a specific
group to kmode. The usage of this file is the same as before.

#cat info/kmode_groups (Renamed "kernel_mode_assignment")
//


Thoughts?

thanks
Babu

Re: [PATCH v2 00/16] fs,x86/resctrl: Add kernel-mode (e.g., PLZA) support to the resctrl subsystem
Posted by Askar Safin 2 weeks ago
Please, remove me from CC list in future versions of this patchset

-- 
Askar Safin