AMD Milan (Zen3) CPUs have an LFENCE Always Serializing CPUID bit in
leaf 80000021.eax. Previous AMD versions used to have a user settable
bit in DE_CFG MSR to select whether LFENCE was dispatch serializing,
which Xen always attempts to set. The forcefully always on setting is
due to the addition of SEV-SNP so that a VMM cannot break the
confidentiality of a guest.

In order to support this new CPUID bit move the LFENCE_DISPATCH
synthetic CPUID bit to map the hardware bit (leaving a hole in the
synthetic range) and either rely on the bit already being set by the
native CPUID output, or attempt to fake it in Xen by modifying the
DE_CFG MSR. This requires adding one more entry to the featureset to
support leaf 80000021.eax.

The bit is exposed to guests by default if the underlying hardware
supports leaf 80000021, as a way to signal that LFENCE is always
serializing. Hardware that doesn't have the leaf might also get the
bit set because Xen has performed the necessary arrangements, but
that's not yet exposed to guests. Note that Xen doesn't allow guests
to change the DE_CFG value, so once set by Xen LFENCE will always be
serializing.

Note that the access to DE_CFG by guests is left as-is: reads will
unconditionally return LFENCE_SERIALISE bit set, while writes are
silently dropped.
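
The guest-visible DE_CFG behaviour described above can be sketched as a small model. This is only an illustration, not the Xen MSR handling code: the function names are hypothetical, the MSR index and bit position are quoted from memory and should be treated as assumptions, and returning only the LFENCE_SERIALISE bit (all other bits zero) is likewise an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values (quoted from memory, verify against msr-index.h). */
#define MSR_AMD64_DE_CFG              0xc0011029u
#define AMD64_DE_CFG_LFENCE_SERIALISE (UINT64_C(1) << 1)

/*
 * Hypothetical model of a guest MSR read. Only DE_CFG is modelled;
 * any other index reports "not handled".
 */
bool model_guest_rdmsr(uint32_t msr, uint64_t *val)
{
    if (msr != MSR_AMD64_DE_CFG)
        return false;

    /* Reads unconditionally report LFENCE as serialising. */
    *val = AMD64_DE_CFG_LFENCE_SERIALISE;
    return true;
}

/*
 * Hypothetical model of a guest MSR write. A write to DE_CFG is
 * accepted (no fault) but the value is discarded.
 */
bool model_guest_wrmsr(uint32_t msr, uint64_t val)
{
    (void)val; /* silently dropped */
    return msr == MSR_AMD64_DE_CFG;
}
```

In this model a guest that writes 0 to DE_CFG and reads it back still sees the LFENCE_SERIALISE bit set, which matches the "left as-is" behaviour in the description.
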
Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v1:
- Rename to lfence+.
- Add feature to libxl_cpuid.c.
- Reword commit message.
---
Note this doesn't yet expose the bit on hardware that doesn't support
leaf 80000021. It's still TBD whether we want to hardcode this support
manually, or instead rely on a more general approach like the one
suggested by the shrink max CPUID leaf patch from Jan.
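
Because the bit only becomes visible when leaf 80000021 is reported, a consumer has to check the maximum extended leaf before trusting EAX bit 2. A minimal sketch of that decode, assuming the caller has already read the two CPUID values (the helper name and macro names are hypothetical):

```c
#include <stdbool.h>
#include <stdint.h>

#define LEAF_EXT_FEAT2        0x80000021u /* leaf named in this patch */
#define LFENCE_ALWAYS_SER_BIT (1u << 2)   /* EAX bit 2, as in the libxl table entry */

/*
 * max_ext_leaf: EAX returned by CPUID leaf 0x80000000.
 * eax_21:       EAX returned by CPUID leaf 0x80000021 (ignored when
 *               the leaf is not reported).
 */
bool lfence_always_serializing(uint32_t max_ext_leaf, uint32_t eax_21)
{
    /* Without the leaf, the bit cannot be observed at all. */
    if (max_ext_leaf < LEAF_EXT_FEAT2)
        return false;

    return (eax_21 & LFENCE_ALWAYS_SER_BIT) != 0;
}
```

With GCC or Clang the two inputs could be obtained through `__get_cpuid()` from `<cpuid.h>`. On hardware where Xen only achieved serialization via DE_CFG, this check returns false with the patch as posted, which is the gap the note above describes.
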
---
tools/libs/light/libxl_cpuid.c | 2 ++
tools/misc/xen-cpuid.c | 6 ++++
xen/arch/x86/cpu/amd.c | 7 ++++
xen/arch/x86/cpu/common.c | 3 ++
xen/include/asm-x86/cpufeatures.h | 2 +-
xen/include/public/arch-x86/cpufeatureset.h | 3 ++
xen/include/xen/lib/x86/cpuid.h | 37 ++++++++++++++++++++-
7 files changed, 58 insertions(+), 2 deletions(-)
diff --git a/tools/libs/light/libxl_cpuid.c b/tools/libs/light/libxl_cpuid.c
index aee28b0430d..d3ab66b9a71 100644
--- a/tools/libs/light/libxl_cpuid.c
+++ b/tools/libs/light/libxl_cpuid.c
@@ -284,6 +284,8 @@ int libxl_cpuid_parse_config(libxl_cpuid_policy_list *cpuid, const char* str)
{"svm_decode", 0x8000000a, NA, CPUID_REG_EDX, 7, 1},
{"svm_pausefilt",0x8000000a, NA, CPUID_REG_EDX, 10, 1},
+ {"lfence+", 0x80000021, NA, CPUID_REG_EAX, 2, 1},
+
{"maxhvleaf", 0x40000000, NA, CPUID_REG_EAX, 0, 8},
{NULL, 0, NA, CPUID_REG_INV, 0, 0}
diff --git a/tools/misc/xen-cpuid.c b/tools/misc/xen-cpuid.c
index 628e8f5aa25..9a47237f4a8 100644
--- a/tools/misc/xen-cpuid.c
+++ b/tools/misc/xen-cpuid.c
@@ -179,6 +179,11 @@ static const char *const str_7a1[32] =
[ 4] = "avx-vnni", [ 5] = "avx512-bf16",
};
+static const char *const str_e21a[32] =
+{
+ [ 2] = "lfence+",
+};
+
static const struct {
const char *name;
const char *abbr;
@@ -196,6 +201,7 @@ static const struct {
{ "0x80000008.ebx", "e8b", str_e8b },
{ "0x00000007:0.edx", "7d0", str_7d0 },
{ "0x00000007:1.eax", "7a1", str_7a1 },
+ { "0x80000021.eax", "e21a", str_e21a },
};
#define COL_ALIGN "18"
diff --git a/xen/arch/x86/cpu/amd.c b/xen/arch/x86/cpu/amd.c
index 9c8dcd91eef..35f22c24762 100644
--- a/xen/arch/x86/cpu/amd.c
+++ b/xen/arch/x86/cpu/amd.c
@@ -646,6 +646,13 @@ void amd_init_lfence(struct cpuinfo_x86 *c)
{
uint64_t value;
+ /*
+ * Some hardware has LFENCE dispatch serializing always enabled,
+ * nothing to do in that case.
+ */
+ if (test_bit(X86_FEATURE_LFENCE_DISPATCH, c->x86_capability))
+ return;
+
/*
* Attempt to set lfence to be Dispatch Serialising. This MSR almost
* certainly isn't virtualised (and Xen at least will leak the real
diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
index e5c3caf41d5..0eb364f8a65 100644
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -412,6 +412,9 @@ static void generic_identify(struct cpuinfo_x86 *c)
if (c->extended_cpuid_level >= 0x80000008)
c->x86_capability[cpufeat_word(X86_FEATURE_CLZERO)]
= cpuid_ebx(0x80000008);
+ if (c->extended_cpuid_level >= 0x80000021)
+ c->x86_capability[cpufeat_word(X86_FEATURE_LFENCE_DISPATCH)]
+ = cpuid_eax(0x80000021);
/* Intel-defined flags: level 0x00000007 */
if ( c->cpuid_level >= 0x00000007 ) {
diff --git a/xen/include/asm-x86/cpufeatures.h b/xen/include/asm-x86/cpufeatures.h
index d7e42d9bb6a..6c8f432aee4 100644
--- a/xen/include/asm-x86/cpufeatures.h
+++ b/xen/include/asm-x86/cpufeatures.h
@@ -24,7 +24,7 @@ XEN_CPUFEATURE(APERFMPERF, X86_SYNTH( 8)) /* APERFMPERF */
XEN_CPUFEATURE(MFENCE_RDTSC, X86_SYNTH( 9)) /* MFENCE synchronizes RDTSC */
XEN_CPUFEATURE(XEN_SMEP, X86_SYNTH(10)) /* SMEP gets used by Xen itself */
XEN_CPUFEATURE(XEN_SMAP, X86_SYNTH(11)) /* SMAP gets used by Xen itself */
-XEN_CPUFEATURE(LFENCE_DISPATCH, X86_SYNTH(12)) /* lfence set as Dispatch Serialising */
+/* Bit 12 - unused. */
XEN_CPUFEATURE(IND_THUNK_LFENCE, X86_SYNTH(13)) /* Use IND_THUNK_LFENCE */
XEN_CPUFEATURE(IND_THUNK_JMP, X86_SYNTH(14)) /* Use IND_THUNK_JMP */
XEN_CPUFEATURE(SC_BRANCH_HARDEN, X86_SYNTH(15)) /* Conditional Branch Hardening */
diff --git a/xen/include/public/arch-x86/cpufeatureset.h b/xen/include/public/arch-x86/cpufeatureset.h
index 42bc8d4279d..732990f2cc0 100644
--- a/xen/include/public/arch-x86/cpufeatureset.h
+++ b/xen/include/public/arch-x86/cpufeatureset.h
@@ -278,6 +278,9 @@ XEN_CPUFEATURE(SSBD, 9*32+31) /*A MSR_SPEC_CTRL.SSBD available */
XEN_CPUFEATURE(AVX_VNNI, 10*32+ 4) /*A AVX-VNNI Instructions */
XEN_CPUFEATURE(AVX512_BF16, 10*32+ 5) /*A AVX512 BFloat16 Instructions */
+/* AMD-defined CPU features, CPUID level 0x80000021.eax, word 11 */
+XEN_CPUFEATURE(LFENCE_DISPATCH, 11*32+ 2) /*A LFENCE always serializing */
+
#endif /* XEN_CPUFEATURE */
/* Clean up from a default include. Close the enum (for C). */
diff --git a/xen/include/xen/lib/x86/cpuid.h b/xen/include/xen/lib/x86/cpuid.h
index f4ef8a9f2f0..a4d254ea96e 100644
--- a/xen/include/xen/lib/x86/cpuid.h
+++ b/xen/include/xen/lib/x86/cpuid.h
@@ -15,6 +15,7 @@
#define FEATURESET_e8b 8 /* 0x80000008.ebx */
#define FEATURESET_7d0 9 /* 0x00000007:0.edx */
#define FEATURESET_7a1 10 /* 0x00000007:1.eax */
+#define FEATURESET_e21a 11 /* 0x80000021.eax */
struct cpuid_leaf
{
@@ -84,7 +85,7 @@ const char *x86_cpuid_vendor_to_str(unsigned int vendor);
#define CPUID_GUEST_NR_TOPO (1u + 1)
#define CPUID_GUEST_NR_XSTATE (62u + 1)
#define CPUID_GUEST_NR_EXTD_INTEL (0x8u + 1)
-#define CPUID_GUEST_NR_EXTD_AMD (0x1cu + 1)
+#define CPUID_GUEST_NR_EXTD_AMD (0x21u + 1)
#define CPUID_GUEST_NR_EXTD MAX(CPUID_GUEST_NR_EXTD_INTEL, \
CPUID_GUEST_NR_EXTD_AMD)
@@ -264,6 +265,38 @@ struct cpuid_policy
};
uint32_t nc:8, :4, apic_id_size:4, :16;
uint32_t /* d */:32;
+
+ uint64_t :64, :64; /* Leaf 0x80000009. */
+ uint64_t :64, :64; /* Leaf 0x8000000a - SVM rev and features. */
+ uint64_t :64, :64; /* Leaf 0x8000000b. */
+ uint64_t :64, :64; /* Leaf 0x8000000c. */
+ uint64_t :64, :64; /* Leaf 0x8000000d. */
+ uint64_t :64, :64; /* Leaf 0x8000000e. */
+ uint64_t :64, :64; /* Leaf 0x8000000f. */
+ uint64_t :64, :64; /* Leaf 0x80000010. */
+ uint64_t :64, :64; /* Leaf 0x80000011. */
+ uint64_t :64, :64; /* Leaf 0x80000012. */
+ uint64_t :64, :64; /* Leaf 0x80000013. */
+ uint64_t :64, :64; /* Leaf 0x80000014. */
+ uint64_t :64, :64; /* Leaf 0x80000015. */
+ uint64_t :64, :64; /* Leaf 0x80000016. */
+ uint64_t :64, :64; /* Leaf 0x80000017. */
+ uint64_t :64, :64; /* Leaf 0x80000018. */
+ uint64_t :64, :64; /* Leaf 0x80000019 - TLB 1GB Identifiers. */
+ uint64_t :64, :64; /* Leaf 0x8000001a - Performance related info. */
+ uint64_t :64, :64; /* Leaf 0x8000001b - IBS feature information. */
+ uint64_t :64, :64; /* Leaf 0x8000001c. */
+ uint64_t :64, :64; /* Leaf 0x8000001d - Cache properties. */
+ uint64_t :64, :64; /* Leaf 0x8000001e - Extd APIC/Core/Node IDs. */
+ uint64_t :64, :64; /* Leaf 0x8000001f - AMD Secure Encryption. */
+ uint64_t :64, :64; /* Leaf 0x80000020 - Platform QoS. */
+
+ /* Leaf 0x80000021 - Extended Feature 2 */
+ union {
+ uint32_t e21a;
+ struct { DECL_BITFIELD(e21a); };
+ };
+ uint32_t /* b */:32, /* c */:32, /* d */:32;
};
} extd;
@@ -293,6 +326,7 @@ static inline void cpuid_policy_to_featureset(
fs[FEATURESET_e8b] = p->extd.e8b;
fs[FEATURESET_7d0] = p->feat._7d0;
fs[FEATURESET_7a1] = p->feat._7a1;
+ fs[FEATURESET_e21a] = p->extd.e21a;
}
/* Fill in a CPUID policy from a featureset bitmap. */
@@ -310,6 +344,7 @@ static inline void cpuid_featureset_to_policy(
p->extd.e8b = fs[FEATURESET_e8b];
p->feat._7d0 = fs[FEATURESET_7d0];
p->feat._7a1 = fs[FEATURESET_7a1];
+ p->extd.e21a = fs[FEATURESET_e21a];
}
static inline uint64_t cpuid_policy_xcr0_max(const struct cpuid_policy *p)
--
2.30.1

On 15.04.2021 16:47, Roger Pau Monne wrote:
> AMD Milan (Zen3) CPUs have an LFENCE Always Serializing CPUID bit in
> leaf 80000021.eax. Previous AMD versions used to have a user settable
> bit in DE_CFG MSR to select whether LFENCE was dispatch serializing,
> which Xen always attempts to set. The forcefully always on setting is
> due to the addition of SEV-SNP so that a VMM cannot break the
> confidentiality of a guest.
>
> In order to support this new CPUID bit move the LFENCE_DISPATCH
> synthetic CPUID bit to map the hardware bit (leaving a hole in the
> synthetic range) and either rely on the bit already being set by the
> native CPUID output, or attempt to fake it in Xen by modifying the
> DE_CFG MSR. This requires adding one more entry to the featureset to
> support leaf 80000021.eax.
>
> The bit is exposed to guests by default if the underlying hardware
> supports leaf 80000021, as a way to signal that LFENCE is always
> serializing. Hardware that doesn't have the leaf might also get the
> bit set because Xen has performed the necessary arrangements, but
> that's not yet exposed to guests. Note that Xen doesn't allow guests
> to change the DE_CFG value, so once set by Xen LFENCE will always be
> serializing.
>
> Note that the access to DE_CFG by guests is left as-is: reads will
> unconditionally return LFENCE_SERIALISE bit set, while writes are
> silently dropped.
>
> Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>

> ---
> Note this doesn't yet expose the bit on hardware that doesn't support
> leaf 80000021. It's still TBD whether we want to hardcode this support
> manually, or instead rely on a more general approach like the one
> suggested by the shrink max CPUID leaf patch from Jan.

I'd like to give Andrew a day or two more to respond there in case he
continues to see an issue, before I commit that with your R-b and this
one here. I'll assume you'll subsequently take care of that missing
piece then - if not, i.e. if e.g. I should, please let me know.

Jan

On Tue, Apr 20, 2021 at 12:35:54PM +0200, Jan Beulich wrote:
> On 15.04.2021 16:47, Roger Pau Monne wrote:
> > AMD Milan (Zen3) CPUs have an LFENCE Always Serializing CPUID bit in
> > leaf 80000021.eax. Previous AMD versions used to have a user settable
> > bit in DE_CFG MSR to select whether LFENCE was dispatch serializing,
> > which Xen always attempts to set. The forcefully always on setting is
> > due to the addition of SEV-SNP so that a VMM cannot break the
> > confidentiality of a guest.
> >
> > In order to support this new CPUID bit move the LFENCE_DISPATCH
> > synthetic CPUID bit to map the hardware bit (leaving a hole in the
> > synthetic range) and either rely on the bit already being set by the
> > native CPUID output, or attempt to fake it in Xen by modifying the
> > DE_CFG MSR. This requires adding one more entry to the featureset to
> > support leaf 80000021.eax.
> >
> > The bit is exposed to guests by default if the underlying hardware
> > supports leaf 80000021, as a way to signal that LFENCE is always
> > serializing. Hardware that doesn't have the leaf might also get the
> > bit set because Xen has performed the necessary arrangements, but
> > that's not yet exposed to guests. Note that Xen doesn't allow guests
> > to change the DE_CFG value, so once set by Xen LFENCE will always be
> > serializing.
> >
> > Note that the access to DE_CFG by guests is left as-is: reads will
> > unconditionally return LFENCE_SERIALISE bit set, while writes are
> > silently dropped.
> >
> > Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>
> > ---
> > Note this doesn't yet expose the bit on hardware that doesn't support
> > leaf 80000021. It's still TBD whether we want to hardcode this support
> > manually, or instead rely on a more general approach like the one
> > suggested by the shrink max CPUID leaf patch from Jan.
>
> I'd like to give Andrew a day or two more to respond there in case he
> continues to see an issue, before I commit that with your R-b and this
> one here. I'll assume you'll subsequently take care of that missing
> piece then - if not, i.e. if e.g. I should, please let me know.

I think it should be something like the above, in fact I think it
would be perfectly fine to merge that chunk into your patch?

Thanks, Roger.
---8<---
diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index 050cd5713e2..daf501779fe 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -314,12 +314,9 @@ static void __init calculate_host_policy(void)
 
     *p = raw_cpuid_policy;
 
-    p->basic.max_leaf =
-        min_t(uint32_t, p->basic.max_leaf, ARRAY_SIZE(p->basic.raw) - 1);
-    p->feat.max_subleaf =
-        min_t(uint32_t, p->feat.max_subleaf, ARRAY_SIZE(p->feat.raw) - 1);
-    p->extd.max_leaf = 0x80000000 | min_t(uint32_t, p->extd.max_leaf & 0xffff,
-                                          ARRAY_SIZE(p->extd.raw) - 1);
+    p->basic.max_leaf = ARRAY_SIZE(p->basic.raw) - 1;
+    p->feat.max_subleaf = ARRAY_SIZE(p->feat.raw) - 1;
+    p->extd.max_leaf = 0x80000000 | ARRAY_SIZE(p->extd.raw) - 1;
 
     cpuid_featureset_to_policy(boot_cpu_data.x86_capability, p);
     recalculate_xstate(p);

On 20.04.2021 12:47, Roger Pau Monné wrote:
> On Tue, Apr 20, 2021 at 12:35:54PM +0200, Jan Beulich wrote:
>> I'd like to give Andrew a day or two more to respond there in case he
>> continues to see an issue, before I commit that with your R-b and this
>> one here. I'll assume you'll subsequently take care of that missing
>> piece then - if not, i.e. if e.g. I should, please let me know.
>
> I think it should be something like the above,

Right (assuming you meant "below").

> in fact I think it
> would be perfectly fine to merge that chunk into your patch?

I'd rather not, so that this change can have its own reasoning in its
description.

Jan

> diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
> index 050cd5713e2..daf501779fe 100644
> --- a/xen/arch/x86/cpuid.c
> +++ b/xen/arch/x86/cpuid.c
> @@ -314,12 +314,9 @@ static void __init calculate_host_policy(void)
>
>      *p = raw_cpuid_policy;
>
> -    p->basic.max_leaf =
> -        min_t(uint32_t, p->basic.max_leaf, ARRAY_SIZE(p->basic.raw) - 1);
> -    p->feat.max_subleaf =
> -        min_t(uint32_t, p->feat.max_subleaf, ARRAY_SIZE(p->feat.raw) - 1);
> -    p->extd.max_leaf = 0x80000000 | min_t(uint32_t, p->extd.max_leaf & 0xffff,
> -                                          ARRAY_SIZE(p->extd.raw) - 1);
> +    p->basic.max_leaf = ARRAY_SIZE(p->basic.raw) - 1;
> +    p->feat.max_subleaf = ARRAY_SIZE(p->feat.raw) - 1;
> +    p->extd.max_leaf = 0x80000000 | ARRAY_SIZE(p->extd.raw) - 1;
>
>      cpuid_featureset_to_policy(boot_cpu_data.x86_capability, p);
>      recalculate_xstate(p);
>

On 15/04/2021 15:47, Roger Pau Monne wrote:
> AMD Milan (Zen3) CPUs have an LFENCE Always Serializing CPUID bit in
> leaf 80000021.eax. Previous AMD versions used to have a user settable
> bit in DE_CFG MSR to select whether LFENCE was dispatch serializing,
> which Xen always attempts to set. The forcefully always on setting is
> due to the addition of SEV-SNP so that a VMM cannot break the
> confidentiality of a guest.
>
> In order to support this new CPUID bit move the LFENCE_DISPATCH
> synthetic CPUID bit to map the hardware bit (leaving a hole in the
> synthetic range) and either rely on the bit already being set by the
> native CPUID output, or attempt to fake it in Xen by modifying the
> DE_CFG MSR. This requires adding one more entry to the featureset to
> support leaf 80000021.eax.
>
> The bit is exposed to guests by default if the underlying hardware
> supports leaf 80000021, as a way to signal that LFENCE is always
> serializing. Hardware that doesn't have the leaf might also get the
> bit set because Xen has performed the necessary arrangements, but
> that's not yet exposed to guests. Note that Xen doesn't allow guests
> to change the DE_CFG value, so once set by Xen LFENCE will always be
> serializing.
>
> Note that the access to DE_CFG by guests is left as-is: reads will
> unconditionally return LFENCE_SERIALISE bit set, while writes are
> silently dropped.
>
> Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
> Changes since v1:
> - Rename to lfence+.
> - Add feature to libxl_cpuid.c.
> - Reword commit message.
> ---
> Note this doesn't yet expose the bit on hardware that doesn't support
> leaf 80000021. It's still TBD whether we want to hardcode this support
> manually, or instead rely on a more general approach like the one
> suggested by the shrink max CPUID leaf patch from Jan.

I'm going to insist on using the manual approach.

Upping max leaf is strictly opposite to shrinking logic. It's very rare
that we'll want to extend max leaf beyond what hardware supports, and it
wants calling out clearly, along with identifying why it is safe to do
so in this specific case.

It is not safe or sensible to blindly escalate to the compile time max.
The only cases where they differ are bugs needing fixing - the manual
approach has the special case clearly called out, while the blindly
escalate case has the bug hidden in derivation logic somewhere else.

~Andrew
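
For readers following the thread, the difference between the two approaches can be sketched as below. This is an illustrative model only, not code from Xen or from any posted patch: the helper name and its parameters are hypothetical, and the single special case shown is an assumption about what a "manual" fix might look like.

```c
#include <stdbool.h>
#include <stdint.h>

#define LEAF_EXT_FEAT2 0x80000021u /* leaf carrying the LFENCE bit */

/*
 * Hypothetical helper computing the host policy's max extended leaf.
 *
 * hw_max_leaf:     max extended leaf reported by hardware.
 * array_max_leaf:  highest extended leaf the policy structure can hold
 *                  (the "compile time max" in the discussion).
 * lfence_dispatch: true if LFENCE is known to be dispatch serialising,
 *                  whether natively or via DE_CFG.
 */
uint32_t host_max_extd_leaf(uint32_t hw_max_leaf, uint32_t array_max_leaf,
                            bool lfence_dispatch)
{
    /* Default: clamp to what hardware reports and the policy can hold. */
    uint32_t max = hw_max_leaf < array_max_leaf ? hw_max_leaf
                                                : array_max_leaf;

    /*
     * Manual special case, called out explicitly: raise the max leaf
     * only as far as needed to expose the LFENCE bit, and only when
     * the property actually holds.
     */
    if (lfence_dispatch && max < LEAF_EXT_FEAT2 &&
        array_max_leaf >= LEAF_EXT_FEAT2)
        max = LEAF_EXT_FEAT2;

    return max;
}
```

The "blind escalation" alternative corresponds to returning array_max_leaf unconditionally, which is what the chunk quoted earlier in the thread does for all three max-leaf fields.
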