[PATCH v3 02/10] x86/cpufeatures: Add Slow Memory Bandwidth Allocation feature flag

Babu Moger posted 10 patches 3 years, 7 months ago
There is a newer version of this series
Posted by Babu Moger 3 years, 7 months ago
Add the new AMD feature X86_FEATURE_SMBA. With this feature, the QOS
enforcement policies can be applied to external slow memory connected
to the host. QOS enforcement is accomplished by assigning a Class Of
Service (COS) to a processor and specifying allocations or limits for
that COS for each resource to be allocated.

This feature is identified by the CPUID Function 8000_0020_EBX_x0.

CPUID Fn8000_0020_EBX_x0 AMD Bandwidth Enforcement Feature Identifiers (ECX=0)
Bits    Field Name      Description
2       L3SBE           L3 external slow memory bandwidth enforcement

The feature description is available in the specification "AMD64 Technology
Platform Quality of Service Extensions", Publication # 56375, Revision 1.03,
Issue Date: February 2022.

Link: https://www.amd.com/en/support/tech-docs/amd64-technology-platform-quality-service-extensions
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/cpufeatures.h |    1 +
 arch/x86/kernel/cpu/scattered.c    |    1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 235dc85c91c3..1815435c9c88 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -304,6 +304,7 @@
 #define X86_FEATURE_UNRET		(11*32+15) /* "" AMD BTB untrain return */
 #define X86_FEATURE_USE_IBPB_FW		(11*32+16) /* "" Use IBPB during runtime firmware calls */
 #define X86_FEATURE_RSB_VMEXIT_LITE	(11*32+17) /* "" Fill RSB on VM exit when EIBRS is enabled */
+#define X86_FEATURE_SMBA		(11*32+18) /* Slow Memory Bandwidth Allocation */
 
 /* Intel-defined CPU features, CPUID level 0x00000007:1 (EAX), word 12 */
 #define X86_FEATURE_AVX_VNNI		(12*32+ 4) /* AVX VNNI instructions */
diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
index fd44b54c90d5..885ecf46abb2 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -44,6 +44,7 @@ static const struct cpuid_bit cpuid_bits[] = {
 	{ X86_FEATURE_CPB,		CPUID_EDX,  9, 0x80000007, 0 },
 	{ X86_FEATURE_PROC_FEEDBACK,    CPUID_EDX, 11, 0x80000007, 0 },
 	{ X86_FEATURE_MBA,		CPUID_EBX,  6, 0x80000008, 0 },
+	{ X86_FEATURE_SMBA,             CPUID_EBX,  2, 0x80000020, 0 },
 	{ X86_FEATURE_PERFMON_V2,	CPUID_EAX,  0, 0x80000022, 0 },
 	{ 0, 0, 0, 0, 0 }
 };

Re: [PATCH v3 02/10] x86/cpufeatures: Add Slow Memory Bandwidth Allocation feature flag
Posted by Reinette Chatre 3 years, 7 months ago
Hi Babu,

On 8/22/2022 6:42 AM, Babu Moger wrote:
> Adds the new AMD feature X86_FEATURE_SMBA. With this feature, the QOS
> enforcement policies can be applied to external slow memory connected
> to the host. QOS enforcement is accomplished by assigning a Class Of
> Service (COS) to a processor and specifying allocations or limits for
> that COS for each resource to be allocated.
> 
> This feature is identified by the CPUID Function 8000_0020_EBX_x0.
> 
> CPUID Fn8000_0020_EBX_x0 AMD Bandwidth Enforcement Feature Identifiers (ECX=0)
> Bits    Field Name      Description
> 2       L3SBE           L3 external slow memory bandwidth enforcement
> 
> Feature description is available in the specification, "AMD64 Technology Platform Quality
> of Service Extensions, Revision: 1.03 Publication # 56375 Revision: 1.03 Issue Date: February 2022".
> 
> Link: https://www.amd.com/en/support/tech-docs/amd64-technology-platform-quality-service-extensions
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> Reviewed-by: Ingo Molnar <mingo@kernel.org>
> ---

resctrl currently supports "memory bandwidth allocation" and this series adds
"slow memory bandwidth allocation". Could you please provide more detail about
what the difference is between "MBA" and "SMBA"? It is clear that the implementation
treats them as different resources, but both resources are associated with L3 cache
domains and (from what I understand) throttling always occurs at the CPU. Can both
types of memory resources thus be seen as downstream from L3 cache? How can
a user know what memory is considered when configuring MBA and what memory is
considered when configuring SMBA? Additionally, I do find the term "slow" to be
vague as a way to distinguish between different memory types. What is the
definition of "slow"? Would all "slow" memory on the system support SMBA?

Reinette
Re: [PATCH v3 02/10] x86/cpufeatures: Add Slow Memory Bandwidth Allocation feature flag
Posted by Moger, Babu 3 years, 7 months ago
On 8/23/2022 5:47 PM, Reinette Chatre wrote:
> Hi Babu,
>
> On 8/22/2022 6:42 AM, Babu Moger wrote:
>> Adds the new AMD feature X86_FEATURE_SMBA. With this feature, the QOS
>> enforcement policies can be applied to external slow memory connected
>> to the host. QOS enforcement is accomplished by assigning a Class Of
>> Service (COS) to a processor and specifying allocations or limits for
>> that COS for each resource to be allocated.
>>
>> This feature is identified by the CPUID Function 8000_0020_EBX_x0.
>>
>> CPUID Fn8000_0020_EBX_x0 AMD Bandwidth Enforcement Feature Identifiers (ECX=0)
>> Bits    Field Name      Description
>> 2       L3SBE           L3 external slow memory bandwidth enforcement
>>
>> Feature description is available in the specification, "AMD64 Technology Platform Quality
>> of Service Extensions, Revision: 1.03 Publication # 56375 Revision: 1.03 Issue Date: February 2022".
>>
>> Link: https://www.amd.com/en/support/tech-docs/amd64-technology-platform-quality-service-extensions
>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> Reviewed-by: Ingo Molnar <mingo@kernel.org>
>> ---
> resctrl currently supports "memory bandwidth allocation" and this series adds
> "slow memory bandwidth allocation". Could you please provide more detail about
> what the difference is between "MBA" and "SMBA"? It is clear that the implementation
In this case, slow memory means memory attached to a CXL device.
> treats them as different resources, but both resources are associated with L3 cache
> domains and (from what I understand) throttling always occurs at the CPU. Can both
> types of memory resources thus be seen as downstream from L3 cache? How can
Yes, that is correct. Both are seen as downstream from L3.
> a user know what memory is considered when configuring MBA and what memory is
> considered when configuring SMBA? Additionally, I do find the term "slow" to be

This memory is completely transparent to the OS, with slightly higher
latency than regular main memory.

Yes, I know the word "slow" is a bit vague. I am not a CXL expert, but I
see the word "slow" being used to refer to CXL memory to differentiate it
from regular memory.

> vague as a way to distinguish between different memory types. What is the
> definition of "slow"? Would all "slow" memory on the system support SMBA?

Yes. All the slow memory in the system can support SMBA.

Thanks

Babu

>
> Reinette
Re: [PATCH v3 02/10] x86/cpufeatures: Add Slow Memory Bandwidth Allocation feature flag
Posted by Reinette Chatre 3 years, 7 months ago
Hi Babu,

On 8/25/2022 3:42 PM, Moger, Babu wrote:
> 
> On 8/23/2022 5:47 PM, Reinette Chatre wrote:
>> Hi Babu,
>>
>> On 8/22/2022 6:42 AM, Babu Moger wrote:
>>> Adds the new AMD feature X86_FEATURE_SMBA. With this feature, the QOS
>>> enforcement policies can be applied to external slow memory connected
>>> to the host. QOS enforcement is accomplished by assigning a Class Of
>>> Service (COS) to a processor and specifying allocations or limits for
>>> that COS for each resource to be allocated.
>>>
>>> This feature is identified by the CPUID Function 8000_0020_EBX_x0.
>>>
>>> CPUID Fn8000_0020_EBX_x0 AMD Bandwidth Enforcement Feature Identifiers (ECX=0)
>>> Bits    Field Name      Description
>>> 2       L3SBE           L3 external slow memory bandwidth enforcement
>>>
>>> Feature description is available in the specification, "AMD64 Technology Platform Quality
>>> of Service Extensions, Revision: 1.03 Publication # 56375 Revision: 1.03 Issue Date: February 2022".
>>>

(snip modified links)

>>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>>> Reviewed-by: Ingo Molnar <mingo@kernel.org>
>>> ---
>> resctrl currently supports "memory bandwidth allocation" and this series adds
>> "slow memory bandwidth allocation". Could you please provide more detail about
>> what the difference is between "MBA" and "SMBA"? It is clear that the implementation

> In this case the slow memory means memory attached to CXL device.

When you say "in this case", is there another case?

Should "Slow Memory Bandwidth Allocation" thus be considered to be "CXL.mem
Memory Bandwidth Allocation"? Why not call it "CXL(.mem?) Memory Bandwidth
Allocation"?

I am not familiar with CXL so please correct me where I am
wrong. From what I understand CXL.mem is a protocol and devices that implement
it can have different memory types ... some faster than others. So, even if
SMBA supports "CXL.mem" devices, could a system have multiple CXL.mem devices,
some faster than others? Would all be configured the same with SMBA (they
would all be classified as "slow" and throttled the same)?

>> treats them as different resources, but both resources are associated with L3 cache
>> domains and (from what I understand) throttling always occurs at the CPU. Can both
>> types of memory resources thus be seen as downstream from L3 cache? How can

> Yes. that is correct. They are seen as downstream from L3.


>> a user know what memory is considered when configuring MBA and what memory is
>> considered when configuring SMBA? Additionally, I do find the term "slow" to be
> 
> This memory completely transparent to OS with little bit higher latency that regular main memory.

I do not think these devices are invisible to the OS though (after
reading Documentation/driver-api/cxl/memory-devices.rst and
Documentation/ABI/testing/sysfs-class-cxl).

Is there not a way to provide some more clarity to users on what
would be throttled? 

> 
> Yes. I know slow word is bit vague. I am not an expert of CXL. But i see that word slow is being used to refer the CXL memory to differentiate it from regular memory.

What is very vague to me is how a user is intended to use this feature.
Would the "SMBA" resource be available only when CXL.mem devices are present
on the system? Since this is a CPU feature it is unclear to me whether
presence of CXL.mem devices would be known at the time "SMBA" is enumerated.
Could the "SMBA" resource thus exist without memory to throttle?

>> vague as a way to distinguish between different memory types. What is the
>> definition of "slow"? Would all "slow" memory on the system support SMBA?
> 
> Yes. All the slow memory in the system can support SMBA.
> 

How does a user know which memory on the system is "slow memory"?

It remains unclear to me how a user is intended to use this feature.

How will a user know which devices/memory (if any) are being
throttled by "SMBA"?

Reinette
Re: [PATCH v3 02/10] x86/cpufeatures: Add Slow Memory Bandwidth Allocation feature flag
Posted by Babu Moger 3 years, 7 months ago
Hi Reinette,
   For some reason this thread did not land in my mailbox, so I am replying
   to the thread using git send-email.

>(snip modified links)

Link: https://www.amd.com/en/support/tech-docs/amd64-technology-platform-quality-service-extensions
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537

>When you say "in this case", is there another case?

There is no other interface. It is only the CXL memory device.

>
>Should "Slow Memory Bandwidth Allocation" thus be considered to be "CXL.mem
>Memory Bandwidth Allocation"? Why not call it "CXL(.mem?) Memory Bandwidth
>Allocation"?

I checked with our team here. Currently, the only supported slow memory is a
CXL.mem device. As for the naming, the "slow" memory landscape is still
evolving, and CXL.mem is simply the only type supported right now. The spec
says "Slow Memory Bandwidth Allocation", so we would prefer to keep it that way.

>
>I am not familiar with CXL so please correct me where I am
>wrong. From what I understand CXL.mem is a protocol and devices that implement
>it can have different memory types ... some faster than others. So, even if
>SMBA supports "CXL.mem" devices, could a system have multiple CXL.mem devices,
>some faster than others? Would all be configured the same with SMBA (they
>would all be classified as "slow" and throttled the same)?

I have not tested multiple devices with different memory speeds here, but
our team says it should work the same way. It appears that the throttling
logic groups all the slow sources together and applies the limit on them as
a whole.

>
>
>
>I do not think these devices are invisible to the OS though (after
>reading Documentation/driver-api/cxl/memory-devices.rst and
>Documentation/ABI/testing/sysfs-class-cxl).
>
>Is there not a way to provide some more clarity to users on what
>would be throttled? 
>
>
>How does a user know which memory on the system is "slow memory"?
>
>It remains unclear to me how a user is intended to use this feature.
>
>How will a user know which devices/memory (if any) are being
>throttled by "SMBA"?
>
This is a new technology. I am still learning. 

So far, I have tested with a CXL 1.1 device. CXL 1.1 uses a simple topology
of direct attachment between the host (such as a CPU or GPU) and the CXL
device.

# numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
node 0 size: 63678 MB
node 0 free: 59542 MB
node 1 cpus:             (CPU list is empty. Node 1 has the CXL memory)
node 1 size: 16122 MB    (There is 16GB of CXL memory)
node 1 free: 15627 MB
node distances:
node   0   1
  0:  10  50
  1:  50  10

The distance from the CPU node to the CXL node (50) is greater than the
local node distance (10).

We can also verify this using lsmem:

# lsmem --output RANGE,SIZE,STATE,NODE,ZONES,BLOCK
RANGE                                 SIZE  STATE NODE  ZONES BLOCK
0x0000000000000000-0x000000007fffffff   2G online    0   None     0
0x0000000080000000-0x00000000ffffffff   2G online    0  DMA32     1
0x0000000100000000-0x0000000fffffffff  60G online    0 Normal  2-31
0x0000001000000000-0x000000107fffffff   2G online    0   None    32
0x0000001080000000-0x000000147fffffff  16G online    1 Normal 33-40

Memory block size:         2G
Total online memory:      82G
Total offline memory:      0B


This can also be verified using the ACPI SRAT table and the memory map.

Thanks
Babu
Re: [PATCH v3 02/10] x86/cpufeatures: Add Slow Memory Bandwidth Allocation feature flag
Posted by Reinette Chatre 3 years, 7 months ago
Hi Babu,

On 8/29/2022 4:25 PM, Babu Moger wrote:
> Hi Reinette,
>    Some reason this thread did not land in my mailbox. Replying using git sendmail to the thread 
> 
>> (snip modified links)
> 
> Link: https://www.amd.com/en/support/tech-docs/amd64-technology-platform-quality-service-extensions
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
> 
>> When you say "in this case", is there another case?
> 
> There is no other interface. It is only CXL memory device.
> 
>>
>> Should "Slow Memory Bandwidth Allocation" thus be considered to be "CXL.mem
>> Memory Bandwidth Allocation"? Why not call it "CXL(.mem?) Memory Bandwidth
>> Allocation"?
> 
> Checked with our team here. The currently only supported slow memory is CXL.mem
> device. As for the naming, the "slow" memory landscape is still evolving.
> While CXL.mem is the only known type supported right now. The specs says
> "Slow Memory Bandwidth Allocation". So, we would prefer to keep it that way.

If you prefer to keep "Slow Memory Bandwidth Allocation" then please also
provide clear information to the user on what is managed via "Memory Bandwidth
Allocation" and what is managed via "Slow Memory Bandwidth Allocation". This
could be in the documentation.

>> I am not familiar with CXL so please correct me where I am
>> wrong. From what I understand CXL.mem is a protocol and devices that implement
>> it can have different memory types ... some faster than others. So, even if
>> SMBA supports "CXL.mem" devices, could a system have multiple CXL.mem devices,
>> some faster than others? Would all be configured the same with SMBA (they
>> would all be classified as "slow" and throttled the same)?
> 
> I have not tested the multiple devices with different memory speeds here.
> But checking with team here says it should just work the same way. It appears
> that the throttling logic groups all the slow sources together and applies
> the limit on them as a whole.

"the throttling logic groups all the slow sources together and applies
the limit on them as a whole.". This is valuable content for
the documentation about this feature. Could the changes to
Documentation/x86/resctrl.rst be updated to include a paragraph
describing SMBA and what is (or is not) considered a "slow resource"? 

>> I do not think these devices are invisible to the OS though (after
>> reading Documentation/driver-api/cxl/memory-devices.rst and
>> Documentation/ABI/testing/sysfs-class-cxl).
>>
>> Is there not a way to provide some more clarity to users on what
>> would be throttled? 
>>

I repeat the question you snipped from my email (please don't do that). Could
you please answer it?:
Would the "SMBA" resource be available only when CXL.mem devices are present
on the system? Since this is a CPU feature it is unclear to me whether
presence of CXL.mem devices would be known at the time "SMBA" is enumerated.
Could the "SMBA" resource thus exist without memory to throttle?

>> How does a user know which memory on the system is "slow memory"?
>>
>> It remains unclear to me how a user is intended to use this feature.
>>
>> How will a user know which devices/memory (if any) are being
>> throttled by "SMBA"?
>>
> This is a new technology. I am still learning. 
> 
> Currently, I have tested with CXL 1.1 type of device. CXL 1.1 uses a simple
> topology structure of direct attachment between host (such as a CPU or GPU)
> and CXL device.
> 
> #numactl -H
> available: 2 nodes (0-1)
> node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
> node 0 size: 63678 MB
> node 0 free: 59542 MB
> node 1 cpus:             (CPU list is empty. Node 1 has CXL memory)
> node 1 size: 16122 MB    (There is 16GB CXL memory) 
> node 1 free: 15627 MB    
> node distances:
> node   0   1
>   0:  10  50
>   1:  50  10
> 
> The cpu-cxl node distance is greater than cpu-to-cpu distances.
> 
> We can also verify using lsmem
>  
> #lsmem --output RANGE,SIZE,STATE,NODE,ZONES,BLOCK
> RANGE                                 SIZE  STATE NODE  ZONES BLOCK
> 0x0000000000000000-0x000000007fffffff   2G online    0   None     0
> 0x0000000080000000-0x00000000ffffffff   2G online    0  DMA32     1
> 0x0000000100000000-0x0000000fffffffff  60G online    0 Normal  2-31
> 0x0000001000000000-0x000000107fffffff   2G online    0   None    32
> 0x0000001080000000-0x000000147fffffff  16G online    1 Normal 33-40
> 
> Memory block size:         2G
> Total online memory:      82G
> Total offline memory:      0B
> 
> 
> We can also verify using ACPI SRAT table and memory maps.

I think that adding (in general terms) that "SMBA throttles CXL.mem
devices" to Documentation/x86/resctrl.rst may be sufficient for
a user to understand what will be throttled without needing to go into
details about CXL device discovery. 

Reinette
RE: [PATCH v3 02/10] x86/cpufeatures: Add Slow Memory Bandwidth Allocation feature flag
Posted by Moger, Babu 3 years, 7 months ago
Hi Reinette,

> -----Original Message-----
> From: Reinette Chatre <reinette.chatre@intel.com>
> Sent: Tuesday, August 30, 2022 11:40 AM
> To: Moger, Babu <Babu.Moger@amd.com>
> Cc: bagasdotme@gmail.com; bp@alien8.de; corbet@lwn.net;
> dave.hansen@linux.intel.com; eranian@google.com; fenghua.yu@intel.com;
> hpa@zytor.com; linux-doc@vger.kernel.org; linux-kernel@vger.kernel.org;
> mingo@redhat.com; tglx@linutronix.de; tony.luck@intel.com; x86@kernel.org
> Subject: Re: [PATCH v3 02/10] x86/cpufeatures: Add Slow Memory Bandwidth
> Allocation feature flag
> 
> Hi Babu,
> 
> On 8/29/2022 4:25 PM, Babu Moger wrote:
> > Hi Reinette,
> >    Some reason this thread did not land in my mailbox. Replying using
> > git sendmail to the thread
> >
> >> (snip modified links)
> >
> > Link: https://www.amd.com/en/support/tech-docs/amd64-technology-platform-quality-service-extensions
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
> >
> >> When you say "in this case", is there another case?
> >
> > There is no other interface. It is only CXL memory device.
> >
> >>
> >> Should "Slow Memory Bandwidth Allocation" thus be considered to be
> >> "CXL.mem Memory Bandwidth Allocation"? Why not call it "CXL(.mem?)
> >> Memory Bandwidth Allocation"?
> >
> > Checked with our team here. The currently only supported slow memory
> > is CXL.mem device. As for the naming, the "slow" memory landscape is still
> > evolving. While CXL.mem is the only known type supported right now. The
> > specs says "Slow Memory Bandwidth Allocation". So, we would prefer to keep
> > it that way.
> 
> If you prefer to keep "Slow Memory Bandwidth Allocation" then please also
> provide clear information to the user on what is managed via "Memory
> Bandwidth Allocation" and what is managed via "Slow Memory Bandwidth
> Allocation". This could be in the documentation.

Sure.
> 
> >> I am not familiar with CXL so please correct me where I am wrong.
> >> From what I understand CXL.mem is a protocol and devices that
> >> implement it can have different memory types ... some faster than
> >> others. So, even if SMBA supports "CXL.mem" devices, could a system
> >> have multiple CXL.mem devices, some faster than others? Would all be
> >> configured the same with SMBA (they would all be classified as "slow" and
> >> throttled the same)?
> >
> > I have not tested the multiple devices with different memory speeds here.
> > But checking with team here says it should just work the same way. It
> > appears that the throttling logic groups all the slow sources together
> > and applies the limit on them as a whole.
> 
> "the throttling logic groups all the slow sources together and applies the limit
> on them as a whole.". This is valuable content for the documentation about this
> feature. Could the changes to Documentation/x86/resctrl.rst be updated to
> include a paragraph describing SMBA and what is (or is not) considered a "slow
> resource"?

Sure.
> 
> >> I do not think these devices are invisible to the OS though (after
> >> reading Documentation/driver-api/cxl/memory-devices.rst and
> >> Documentation/ABI/testing/sysfs-class-cxl).
> >>
> >> Is there not a way to provide some more clarity to users on what
> >> would be throttled?
> >>
> 
> I repeat the question you snipped from my email (please don't do that). Could
> you please answer it?:

Sorry, that was not intentional.

> Would the "SMBA" resource be available only when CXL.mem devices are
> present on the system? Since this is a CPU feature it is unclear to me whether
> presence of CXL.mem devices would be known at the time "SMBA" is
> enumerated.
> Could the "SMBA" resource thus exist without memory to throttle?

Yes. The presence of the SMBA feature (with CXL.mem) is independent of whether
slow memory is actually present in the system. If there is no slow memory,
setting an SMBA limit will have no impact on the performance of the system.
> 
> >> How does a user know which memory on the system is "slow memory"?
> >>
> >> It remains unclear to me how a user is intended to use this feature.
> >>
> >> How will a user know which devices/memory (if any) are being
> >> throttled by "SMBA"?
> >>
> > This is a new technology. I am still learning.
> >
> > Currently, I have tested with CXL 1.1 type of device. CXL 1.1 uses a
> > simple topology structure of direct attachment between host (such as a
> > CPU or GPU) and CXL device.
> >
> > #numactl -H
> > available: 2 nodes (0-1)
> > node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
> > node 0 size: 63678 MB
> > node 0 free: 59542 MB
> > node 1 cpus:             (CPU list is empty. Node 1 has CXL memory)
> > node 1 size: 16122 MB    (There is 16GB CXL memory)
> > node 1 free: 15627 MB
> > node distances:
> > node   0   1
> >   0:  10  50
> >   1:  50  10
> >
> > The cpu-cxl node distance is greater than cpu-to-cpu distances.
> >
> > We can also verify using lsmem
> >
> > #lsmem --output RANGE,SIZE,STATE,NODE,ZONES,BLOCK
> > RANGE                                 SIZE  STATE NODE  ZONES BLOCK
> > 0x0000000000000000-0x000000007fffffff   2G online    0   None     0
> > 0x0000000080000000-0x00000000ffffffff   2G online    0  DMA32     1
> > 0x0000000100000000-0x0000000fffffffff  60G online    0 Normal  2-31
> > 0x0000001000000000-0x000000107fffffff   2G online    0   None    32
> > 0x0000001080000000-0x000000147fffffff  16G online    1 Normal 33-40
> >
> > Memory block size:         2G
> > Total online memory:      82G
> > Total offline memory:      0B
> >
> >
> > We can also verify using ACPI SRAT table and memory maps.
> 
> I think that adding (in general terms) that "SMBA throttles CXL.mem devices" to
> Documentation/x86/resctrl.rst may be sufficient for a user to understand what
> will be throttled without needing to go into details about CXL device discovery.

Sure.
Thanks
Babu