Adds the new AMD feature X86_FEATURE_SMBA. With this feature, the QOS
enforcement policies can be applied to external slow memory connected
to the host. QOS enforcement is accomplished by assigning a Class Of
Service (COS) to a processor and specifying allocations or limits for
that COS for each resource to be allocated.
This feature is identified by the CPUID Function 8000_0020_EBX_x0.
CPUID Fn8000_0020_EBX_x0 AMD Bandwidth Enforcement Feature Identifiers (ECX=0)
Bits Field Name Description
2 L3SBE L3 external slow memory bandwidth enforcement
Currently, CXL.memory is the only supported "slow" memory device. With the
support of SMBA feature the hardware enables bandwidth allocation on the
slow memory devices. If there are multiple slow memory devices in the system,
then the throttling logic groups all the slow sources together and applies
the limit on them as a whole.
The presence of the SMBA feature(with CXL.memory) is independent of whether
slow memory device is actually present in the system. If there is no slow
memory in the system, then setting a SMBA limit will have no impact on the
performance of the system.
Presence of CXL memory can be identified by numactl command.
$numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
node 0 size: 63678 MB node 0 free: 59542 MB
node 1 cpus:
node 1 size: 16122 MB
node 1 free: 15627 MB
node distances:
node 0 1
0: 10 50
1: 50 10
CPU list for CXL memory will be emply. The cpu-cxl node distance is greater
than cpu-to-cpu distances. Node 1 has the CXL memory in this case. CXL
memory can also be identified using ACPI SRAT table and memory maps.
Feature description is available in the specification, "AMD64 Technology Platform Quality
of Service Extensions, Revision: 1.03 Publication # 56375 Revision: 1.03 Issue Date: February 2022".
Link: https://www.amd.com/en/support/tech-docs/amd64-technology-platform-quality-service-extensions
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/kernel/cpu/scattered.c | 1 +
2 files changed, 2 insertions(+)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 235dc85c91c3..1815435c9c88 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -304,6 +304,7 @@
#define X86_FEATURE_UNRET (11*32+15) /* "" AMD BTB untrain return */
#define X86_FEATURE_USE_IBPB_FW (11*32+16) /* "" Use IBPB during runtime firmware calls */
#define X86_FEATURE_RSB_VMEXIT_LITE (11*32+17) /* "" Fill RSB on VM exit when EIBRS is enabled */
+#define X86_FEATURE_SMBA (11*32+18) /* SLOW Memory Bandwidth Allocation */
/* Intel-defined CPU features, CPUID level 0x00000007:1 (EAX), word 12 */
#define X86_FEATURE_AVX_VNNI (12*32+ 4) /* AVX VNNI instructions */
diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
index fd44b54c90d5..885ecf46abb2 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -44,6 +44,7 @@ static const struct cpuid_bit cpuid_bits[] = {
{ X86_FEATURE_CPB, CPUID_EDX, 9, 0x80000007, 0 },
{ X86_FEATURE_PROC_FEEDBACK, CPUID_EDX, 11, 0x80000007, 0 },
{ X86_FEATURE_MBA, CPUID_EBX, 6, 0x80000008, 0 },
+ { X86_FEATURE_SMBA, CPUID_EBX, 2, 0x80000020, 0 },
{ X86_FEATURE_PERFMON_V2, CPUID_EAX, 0, 0x80000022, 0 },
{ 0, 0, 0, 0, 0 }
};
Hi Babu,
On 9/7/2022 11:00 AM, Babu Moger wrote:
> Adds the new AMD feature X86_FEATURE_SMBA. With this feature, the QOS
Adds -> Add
> enforcement policies can be applied to external slow memory connected
> to the host. QOS enforcement is accomplished by assigning a Class Of
> Service (COS) to a processor and specifying allocations or limits for
> that COS for each resource to be allocated.
>
> This feature is identified by the CPUID Function 8000_0020_EBX_x0.
>
> CPUID Fn8000_0020_EBX_x0 AMD Bandwidth Enforcement Feature Identifiers (ECX=0)
> Bits Field Name Description
> 2 L3SBE L3 external slow memory bandwidth enforcement
>
> Currently, CXL.memory is the only supported "slow" memory device. With the
> support of SMBA feature the hardware enables bandwidth allocation on the
> slow memory devices. If there are multiple slow memory devices in the system,
> then the throttling logic groups all the slow sources together and applies
> the limit on them as a whole.
The above is a useful addition (made in patch 13/13) to the documentation ...
>
> The presence of the SMBA feature(with CXL.memory) is independent of whether
> slow memory device is actually present in the system. If there is no slow
> memory in the system, then setting a SMBA limit will have no impact on the
> performance of the system.
... could the above snippet also please be added to the documentation?
>
> Presence of CXL memory can be identified by numactl command.
>
> $numactl -H
> available: 2 nodes (0-1)
> node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
> node 0 size: 63678 MB node 0 free: 59542 MB
> node 1 cpus:
> node 1 size: 16122 MB
> node 1 free: 15627 MB
> node distances:
> node 0 1
> 0: 10 50
> 1: 50 10
>
> CPU list for CXL memory will be emply. The cpu-cxl node distance is greater
emply -> empty?
> than cpu-to-cpu distances. Node 1 has the CXL memory in this case. CXL
> memory can also be identified using ACPI SRAT table and memory maps.
>
> Feature description is available in the specification, "AMD64 Technology Platform Quality
> of Service Extensions, Revision: 1.03 Publication # 56375 Revision: 1.03 Issue Date: February 2022".
Please shorten these lines to the recommended 75 chars per line.
>
> Link: https://www.amd.com/en/support/tech-docs/amd64-technology-platform-quality-service-extensions
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
> arch/x86/include/asm/cpufeatures.h | 1 +
> arch/x86/kernel/cpu/scattered.c | 1 +
> 2 files changed, 2 insertions(+)
>
> diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> index 235dc85c91c3..1815435c9c88 100644
> --- a/arch/x86/include/asm/cpufeatures.h
> +++ b/arch/x86/include/asm/cpufeatures.h
> @@ -304,6 +304,7 @@
> #define X86_FEATURE_UNRET (11*32+15) /* "" AMD BTB untrain return */
> #define X86_FEATURE_USE_IBPB_FW (11*32+16) /* "" Use IBPB during runtime firmware calls */
> #define X86_FEATURE_RSB_VMEXIT_LITE (11*32+17) /* "" Fill RSB on VM exit when EIBRS is enabled */
> +#define X86_FEATURE_SMBA (11*32+18) /* SLOW Memory Bandwidth Allocation */
Why is "SLOW" in all caps?
>
> /* Intel-defined CPU features, CPUID level 0x00000007:1 (EAX), word 12 */
> #define X86_FEATURE_AVX_VNNI (12*32+ 4) /* AVX VNNI instructions */
> diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
> index fd44b54c90d5..885ecf46abb2 100644
> --- a/arch/x86/kernel/cpu/scattered.c
> +++ b/arch/x86/kernel/cpu/scattered.c
> @@ -44,6 +44,7 @@ static const struct cpuid_bit cpuid_bits[] = {
> { X86_FEATURE_CPB, CPUID_EDX, 9, 0x80000007, 0 },
> { X86_FEATURE_PROC_FEEDBACK, CPUID_EDX, 11, 0x80000007, 0 },
> { X86_FEATURE_MBA, CPUID_EBX, 6, 0x80000008, 0 },
> + { X86_FEATURE_SMBA, CPUID_EBX, 2, 0x80000020, 0 },
> { X86_FEATURE_PERFMON_V2, CPUID_EAX, 0, 0x80000022, 0 },
> { 0, 0, 0, 0, 0 }
> };
>
>
Could you please follow the coding style (wrt tabs vs. spaces) of the area you
are contributing to here? Please do so in all patches in this series.
Reinette
[AMD Official Use Only - General]
Hi Reinette,
> -----Original Message-----
> From: Reinette Chatre <reinette.chatre@intel.com>
> Sent: Friday, September 16, 2022 10:54 AM
> To: Moger, Babu <Babu.Moger@amd.com>; corbet@lwn.net;
> tglx@linutronix.de; mingo@redhat.com; bp@alien8.de
> Cc: fenghua.yu@intel.com; dave.hansen@linux.intel.com; x86@kernel.org;
> hpa@zytor.com; paulmck@kernel.org; akpm@linux-foundation.org;
> quic_neeraju@quicinc.com; rdunlap@infradead.org;
> damien.lemoal@opensource.wdc.com; songmuchun@bytedance.com;
> peterz@infradead.org; jpoimboe@kernel.org; pbonzini@redhat.com;
> chang.seok.bae@intel.com; pawan.kumar.gupta@linux.intel.com;
> jmattson@google.com; daniel.sneddon@linux.intel.com; Das1, Sandipan
> <Sandipan.Das@amd.com>; tony.luck@intel.com; james.morse@arm.com;
> linux-doc@vger.kernel.org; linux-kernel@vger.kernel.org;
> bagasdotme@gmail.com; eranian@google.com
> Subject: Re: [PATCH v4 03/13] x86/cpufeatures: Add Slow Memory Bandwidth
> Allocation feature flag
>
> Hi Babu,
>
> On 9/7/2022 11:00 AM, Babu Moger wrote:
> > Adds the new AMD feature X86_FEATURE_SMBA. With this feature, the QOS
>
> Adds -> Add
Sure
>
> > enforcement policies can be applied to external slow memory connected
> > to the host. QOS enforcement is accomplished by assigning a Class Of
> > Service (COS) to a processor and specifying allocations or limits for
> > that COS for each resource to be allocated.
> >
> > This feature is identified by the CPUID Function 8000_0020_EBX_x0.
> >
> > CPUID Fn8000_0020_EBX_x0 AMD Bandwidth Enforcement Feature
> Identifiers (ECX=0)
> > Bits Field Name Description
> > 2 L3SBE L3 external slow memory bandwidth enforcement
> >
> > Currently, CXL.memory is the only supported "slow" memory device. With
> > the support of SMBA feature the hardware enables bandwidth allocation
> > on the slow memory devices. If there are multiple slow memory devices
> > in the system, then the throttling logic groups all the slow sources
> > together and applies the limit on them as a whole.
>
> The above is a useful addition (made in patch 13/13) to the documentation ...
>
> >
> > The presence of the SMBA feature(with CXL.memory) is independent of
> > whether slow memory device is actually present in the system. If there
> > is no slow memory in the system, then setting a SMBA limit will have
> > no impact on the performance of the system.
>
> ... could the above snippet also please be added to the documentation?
>
Ok Sure.
> >
> > Presence of CXL memory can be identified by numactl command.
> >
> > $numactl -H
> > available: 2 nodes (0-1)
> > node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 node 0 size:
> > 63678 MB node 0 free: 59542 MB node 1 cpus:
> > node 1 size: 16122 MB
> > node 1 free: 15627 MB
> > node distances:
> > node 0 1
> > 0: 10 50
> > 1: 50 10
> >
> > CPU list for CXL memory will be emply. The cpu-cxl node distance is
> > greater
>
> emply -> empty?
ok
>
> > than cpu-to-cpu distances. Node 1 has the CXL memory in this case. CXL
> > memory can also be identified using ACPI SRAT table and memory maps.
> >
> > Feature description is available in the specification, "AMD64
> > Technology Platform Quality of Service Extensions, Revision: 1.03 Publication
> # 56375 Revision: 1.03 Issue Date: February 2022".
>
> Please shorten these lines to the recommended 75 chars per line.
Sure
>
> >
> > Link:
> > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.
> > amd.com%2Fen%2Fsupport%2Ftech-docs%2Famd64-technology-platform-
> quality
> > -service-
> extensions&data=05%7C01%7Cbabu.moger%40amd.com%7C60553f32
> >
> e9ab4ed1219c08da97fbc048%7C3dd8961fe4884e608e11a82d994e183d%7C0%
> 7C0%7C
> >
> 637989404770663129%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMD
> AiLCJQIjo
> >
> iV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdat
> a=WY4H
> > EzWHdMpMUUR%2FBnBupwdHuQ6O2RfdrfcGx4TkXfI%3D&reserved=0
> > Link:
> > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugz
> >
> illa.kernel.org%2Fshow_bug.cgi%3Fid%3D206537&data=05%7C01%7Cbab
> u.m
> >
> oger%40amd.com%7C60553f32e9ab4ed1219c08da97fbc048%7C3dd8961fe488
> 4e608e
> >
> 11a82d994e183d%7C0%7C0%7C637989404770663129%7CUnknown%7CTWFpb
> GZsb3d8ey
> >
> JWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7
> C300
> >
> 0%7C%7C%7C&sdata=7e7sGH0iaEW8mlst5mK3fn9wy%2FYRhDU%2BbBm
> PWzSrGL4%3
> > D&reserved=0
> > Signed-off-by: Babu Moger <babu.moger@amd.com>
> > ---
> > arch/x86/include/asm/cpufeatures.h | 1 +
> > arch/x86/kernel/cpu/scattered.c | 1 +
> > 2 files changed, 2 insertions(+)
> >
> > diff --git a/arch/x86/include/asm/cpufeatures.h
> > b/arch/x86/include/asm/cpufeatures.h
> > index 235dc85c91c3..1815435c9c88 100644
> > --- a/arch/x86/include/asm/cpufeatures.h
> > +++ b/arch/x86/include/asm/cpufeatures.h
> > @@ -304,6 +304,7 @@
> > #define X86_FEATURE_UNRET (11*32+15) /* "" AMD BTB
> untrain return */
> > #define X86_FEATURE_USE_IBPB_FW (11*32+16) /* "" Use IBPB
> during runtime firmware calls */
> > #define X86_FEATURE_RSB_VMEXIT_LITE (11*32+17) /* "" Fill RSB on
> VM exit when EIBRS is enabled */
> > +#define X86_FEATURE_SMBA (11*32+18) /* SLOW Memory
> Bandwidth Allocation */
>
> Why is "SLOW" in all caps?
Will correct it.
>
> >
> > /* Intel-defined CPU features, CPUID level 0x00000007:1 (EAX), word 12 */
> > #define X86_FEATURE_AVX_VNNI (12*32+ 4) /* AVX VNNI
> instructions */
> > diff --git a/arch/x86/kernel/cpu/scattered.c
> > b/arch/x86/kernel/cpu/scattered.c index fd44b54c90d5..885ecf46abb2
> > 100644
> > --- a/arch/x86/kernel/cpu/scattered.c
> > +++ b/arch/x86/kernel/cpu/scattered.c
> > @@ -44,6 +44,7 @@ static const struct cpuid_bit cpuid_bits[] = {
> > { X86_FEATURE_CPB, CPUID_EDX, 9, 0x80000007, 0 },
> > { X86_FEATURE_PROC_FEEDBACK, CPUID_EDX, 11, 0x80000007, 0 },
> > { X86_FEATURE_MBA, CPUID_EBX, 6, 0x80000008, 0 },
> > + { X86_FEATURE_SMBA, CPUID_EBX, 2, 0x80000020, 0 },
> > { X86_FEATURE_PERFMON_V2, CPUID_EAX, 0, 0x80000022, 0 },
> > { 0, 0, 0, 0, 0 }
> > };
> >
> >
>
> Could you please follow the coding style (wrt tabs vs. spaces) of the area you
> are contributing to here? Please do so in all patches in this series.
Sure. Thanks
Babu Moger
© 2016 - 2026 Red Hat, Inc.