[PATCH v8 11/29] KVM: arm64: Document the KVM ABI for SME

Mark Brown posted 29 patches 5 months, 1 week ago
There is a newer version of this series
Posted by Mark Brown 5 months, 1 week ago
SME, the Scalable Matrix Extension, is an arm64 extension which adds
support for matrix operations, with core concepts patterned after SVE.

SVE introduced some complication in the ABI since it adds new vector
floating point registers with runtime configurable size, the size being
controlled by a parameter called the vector length (VL). To give VMMs
control of this we offer two phase configuration of SVE. SVE must first
be enabled for the vCPU with KVM_ARM_VCPU_INIT(KVM_ARM_VCPU_SVE), after
which the vector length may be configured. The configurably sized
floating point registers remain inaccessible until the configuration is
finalized with a call to KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_SVE).
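
As a rough illustration of the two phase flow described above, here is a
small userspace model of the ordering rules. It is a sketch only and not
kernel code: struct vcpu_model and the model_*() helpers are hypothetical
names, the vector length set is reduced to a single maximum, and the error
codes for register access before finalization are assumed for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the init/configure/finalize ordering. */
struct vcpu_model {
	unsigned host_max_vq;	/* largest vector length the host offers */
	unsigned max_vq;	/* currently configured maximum, in quadwords */
	bool finalized;		/* set once configuration is locked */
};

/* Models KVM_ARM_VCPU_INIT with the vector feature requested. */
static void model_init(struct vcpu_model *v, unsigned host_max_vq)
{
	assert(host_max_vq >= 1);
	v->host_max_vq = host_max_vq;
	v->max_vq = host_max_vq;	/* starts at the best the host supports */
	v->finalized = false;
}

/* Models writing the vector length pseudo-register. */
static int model_set_max_vq(struct vcpu_model *v, unsigned vq)
{
	if (v->finalized)
		return -EPERM;		/* immutable after finalization */
	if (vq < 1 || vq > v->host_max_vq)
		return -EINVAL;		/* not a length the host can provide */
	v->max_vq = vq;
	return 0;
}

/* Models KVM_ARM_VCPU_FINALIZE. */
static int model_finalize(struct vcpu_model *v)
{
	if (v->finalized)
		return -EPERM;
	v->finalized = true;
	return 0;
}

/* Models access to a configurably sized register (error code assumed). */
static int model_access_zreg(const struct vcpu_model *v)
{
	return v->finalized ? 0 : -EPERM;
}
```

In this model the vector length can only change between model_init() and
model_finalize(), and the sized registers only become usable afterwards.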

SME introduces an additional independent configurable vector length
which as well as controlling the size of the new ZA register also
provides an alternative view of the configurably sized SVE registers
(known as streaming mode) with the guest able to switch between the two
modes as it pleases.  There is also a fixed sized register ZT0
introduced in SME2. As well as streaming mode the guest may enable and
disable ZA and (where SME2 is available) ZT0 dynamically independently
of streaming mode. These modes are controlled via the system register
SVCR.

We handle the configuration of the vector length for SME in a similar
manner to SVE, requiring initialization and finalization of the feature
with a pseudo register controlling the available SME vector lengths as for
SVE. Further, if the guest has both SVE and SME then finalizing one
prevents further configuration of the vector length for the other.
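
For reference, a sketch of how a VMM might manipulate the vector length
bitmap carried by these pseudo registers. This assumes the SME register
uses the same layout as the existing SVE one (bit vq - 1 set means a
vector length of vq * 16 bytes is available). The constants are local
stand-ins for the uapi definitions and the vls_*() helpers are
hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the uapi constants; values assumed here. */
#define VQ_MIN		1u
#define VQ_MAX		512u
#define VLS_WORDS	((VQ_MAX - VQ_MIN) / 64u + 1u)	/* 8 words */

/* True if a vector length of vq * 16 bytes is present in the bitmap. */
static bool vls_has_vq(const uint64_t vls[VLS_WORDS], unsigned vq)
{
	if (vq < VQ_MIN || vq > VQ_MAX)
		return false;
	unsigned bit = vq - VQ_MIN;
	return (vls[bit / 64] >> (bit % 64)) & 1;
}

/* Mark a vector length of vq * 16 bytes as available. */
static void vls_add_vq(uint64_t vls[VLS_WORDS], unsigned vq)
{
	assert(vq >= VQ_MIN && vq <= VQ_MAX);
	unsigned bit = vq - VQ_MIN;
	vls[bit / 64] |= (uint64_t)1 << (bit % 64);
}

/* Largest vq present in the bitmap, or 0 if the bitmap is empty. */
static unsigned vls_max_vq(const uint64_t vls[VLS_WORDS])
{
	for (unsigned vq = VQ_MAX; vq >= VQ_MIN; vq--)
		if (vls_has_vq(vls, vq))
			return vq;
	return 0;
}
```

A VMM would read the bitmap with KVM_GET_ONE_REG, optionally clear bits
for lengths it does not want the guest to see, and write it back with
KVM_SET_ONE_REG before finalizing.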

Where both SVE and SME are configured for the guest we always present
the SVE registers to userspace as having the larger of the configured
maximum SVE and SME vector lengths, discarding extra data at load time
and zero padding on read as required if the active vector length is
lower. Note that this means that enabling or disabling streaming mode
while the guest is stopped will not zero Zn or Pn as it will when the
guest is running, but it does allow SVCR, Zn and Pn to be read and
written in any order.
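
A minimal sketch of the padding and truncation described in this
paragraph, assuming a byte buffer per Z register. The helper names are
hypothetical and this is not how the kernel stores the state, it only
shows the arithmetic on vector lengths expressed in 128-bit quadwords.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define VQ_BYTES 16u	/* one quadword is 128 bits */

/* Size userspace sees: the larger of the two configured maximums. */
static unsigned userspace_view_vq(unsigned sve_max_vq, unsigned sme_max_vq)
{
	return sve_max_vq > sme_max_vq ? sve_max_vq : sme_max_vq;
}

/* Read: copy the live active_vq quadwords, zero pad up to view_vq. */
static void zreg_read_view(unsigned char *out, unsigned view_vq,
			   const unsigned char *live, unsigned active_vq)
{
	assert(active_vq <= view_vq);
	size_t live_len = (size_t)active_vq * VQ_BYTES;
	size_t view_len = (size_t)view_vq * VQ_BYTES;

	memcpy(out, live, live_len);
	memset(out + live_len, 0, view_len - live_len);
}

/* Load: keep the low active_vq quadwords, discard the extra data. */
static void zreg_load_view(unsigned char *live, unsigned active_vq,
			   const unsigned char *in, unsigned view_vq)
{
	assert(active_vq <= view_vq);
	memcpy(live, in, (size_t)active_vq * VQ_BYTES);
}
```

With an SVE maximum of 2 quadwords and an SME maximum of 4, userspace
would always see 64 byte Z registers, with the top 32 bytes reading as
zero while the vCPU is not in streaming mode.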

Userspace access to ZA and (if configured) ZT0 is always available. They
will be zeroed when the guest runs if disabled in SVCR, and the value
read will be zero if the guest stops with them disabled. This mirrors
the behaviour of the architecture, where enabling access causes ZA and
ZT0 to be zeroed, while allowing access to SVCR, ZA and ZT0 to be
performed in any order.

If SME is enabled for a guest without SVE then the FPSIMD Vn registers
must be accessed via the low 128 bits of the SVE Zn registers as is the
case when SVE is enabled. This is not ideal but allows access to SVCR and
the registers in any order without duplication or ambiguity about which
values should take effect. This may be an issue for VMMs that are
unaware of SME on systems that implement it without SVE if they let SME
be enabled, since the lack of access to Vn may surprise them, but
implementing SME without SVE seems like an unusual choice.

For SME-unaware VMMs on systems with both SVE and SME support the SVE
registers may be larger than expected. This should be less disruptive
than on a system without SVE as they will simply ignore the high bits of
the registers.

Signed-off-by: Mark Brown <broonie@kernel.org>
---
 Documentation/virt/kvm/api.rst | 115 +++++++++++++++++++++++++++++------------
 1 file changed, 81 insertions(+), 34 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 6aa40ee05a4a..94a22407a1d4 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -406,7 +406,7 @@ Errors:
              instructions from device memory (arm64)
   ENOSYS     data abort outside memslots with no syndrome info and
              KVM_CAP_ARM_NISV_TO_USER not enabled (arm64)
-  EPERM      SVE feature set but not finalized (arm64)
+  EPERM      SVE or SME feature set but not finalized (arm64)
   =======    ==============================================================
 
 This ioctl is used to run a guest virtual cpu.  While there are no
@@ -2601,11 +2601,11 @@ Specifically:
 ======================= ========= ===== =======================================
 
 .. [1] These encodings are not accepted for SVE-enabled vcpus.  See
-       :ref:`KVM_ARM_VCPU_INIT`.
+       :ref:`KVM_ARM_VCPU_INIT`.  They are also not accepted when SME is
+       enabled without SVE and the vcpu is in streaming mode.
 
        The equivalent register content can be accessed via bits [127:0] of
-       the corresponding SVE Zn registers instead for vcpus that have SVE
-       enabled (see below).
+       the corresponding SVE Zn registers in these cases (see below).
 
 arm64 CCSIDR registers are demultiplexed by CSSELR value::
 
@@ -2636,24 +2636,34 @@ arm64 SVE registers have the following bit patterns::
   0x6050 0000 0015 060 <slice:5>        FFR bits[256*slice + 255 : 256*slice]
   0x6060 0000 0015 ffff                 KVM_REG_ARM64_SVE_VLS pseudo-register
 
-Access to register IDs where 2048 * slice >= 128 * max_vq will fail with
-ENOENT.  max_vq is the vcpu's maximum supported vector length in 128-bit
-quadwords: see [2]_ below.
+arm64 SME registers have the following bit patterns::
+
+  0x6080 0000 0017 00 <n:5> <slice:5>   ZA.H[n] bits[2048*slice + 2047 : 2048*slice]
+  0x60XX 0000 0017 0100                 ZT0
+  0x6060 0000 0017 fffe                 KVM_REG_ARM64_SME_VLS pseudo-register
+
+Access to Z, P or ZA register IDs where 2048 * slice >= 128 * max_vq
+will fail with ENOENT.  max_vq is the vcpu's maximum supported vector
+length in 128-bit quadwords: see [2]_ below.
+
+Access to the ZA and ZT0 registers is only available if SVCR.ZA is set
+to 1.
 
 These registers are only accessible on vcpus for which SVE is enabled.
 See KVM_ARM_VCPU_INIT for details.
 
-In addition, except for KVM_REG_ARM64_SVE_VLS, these registers are not
-accessible until the vcpu's SVE configuration has been finalized
-using KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_SVE).  See KVM_ARM_VCPU_INIT
-and KVM_ARM_VCPU_FINALIZE for more information about this procedure.
+In addition, except for KVM_REG_ARM64_SVE_VLS and
+KVM_REG_ARM64_SME_VLS, these registers are not accessible until the
+vcpu's SVE and SME configuration has been finalized using
+KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_VEC).  See KVM_ARM_VCPU_INIT and
+KVM_ARM_VCPU_FINALIZE for more information about this procedure.
 
-KVM_REG_ARM64_SVE_VLS is a pseudo-register that allows the set of vector
-lengths supported by the vcpu to be discovered and configured by
-userspace.  When transferred to or from user memory via KVM_GET_ONE_REG
-or KVM_SET_ONE_REG, the value of this register is of type
-__u64[KVM_ARM64_SVE_VLS_WORDS], and encodes the set of vector lengths as
-follows::
+KVM_REG_ARM64_SVE_VLS and KVM_REG_ARM64_SME_VLS are pseudo-registers
+that allow the set of vector lengths supported by the vcpu to be
+discovered and configured by userspace.  When transferred to or from
+user memory via KVM_GET_ONE_REG or KVM_SET_ONE_REG, the value of each
+register is of type __u64[KVM_ARM64_SVE_VLS_WORDS], and encodes the
+set of vector lengths as follows::
 
   __u64 vector_lengths[KVM_ARM64_SVE_VLS_WORDS];
 
@@ -2665,19 +2675,25 @@ follows::
 	/* Vector length vq * 16 bytes not supported */
 
 .. [2] The maximum value vq for which the above condition is true is
-       max_vq.  This is the maximum vector length available to the guest on
-       this vcpu, and determines which register slices are visible through
-       this ioctl interface.
+       max_vq.  This is the maximum vector length currently available to
+       the guest on this vcpu, and determines which register slices are
+       visible through this ioctl interface.
+
+       If SME is supported then the max_vq used for the Z and P registers
+       while SVCR.SM is 1 this vector length will be the maximum SME
+       vector length available for the guest, otherwise it will be the
+       maximum SVE vector length available.
 
 (See Documentation/arch/arm64/sve.rst for an explanation of the "vq"
 nomenclature.)
 
-KVM_REG_ARM64_SVE_VLS is only accessible after KVM_ARM_VCPU_INIT.
-KVM_ARM_VCPU_INIT initialises it to the best set of vector lengths that
-the host supports.
+KVM_REG_ARM64_SVE_VLS and KVM_REG_ARM64_SME_VLS are only accessible
+after KVM_ARM_VCPU_INIT.  KVM_ARM_VCPU_INIT initialises them to the
+best set of vector lengths that the host supports.
 
-Userspace may subsequently modify it if desired until the vcpu's SVE
-configuration is finalized using KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_SVE).
+Userspace may subsequently modify these registers if desired until the
+vcpu's SVE and SME configuration is finalized using
+KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_VEC).
 
 Apart from simply removing all vector lengths from the host set that
 exceed some value, support for arbitrarily chosen sets of vector lengths
@@ -2685,8 +2701,8 @@ is hardware-dependent and may not be available.  Attempting to configure
 an invalid set of vector lengths via KVM_SET_ONE_REG will fail with
 EINVAL.
 
-After the vcpu's SVE configuration is finalized, further attempts to
-write this register will fail with EPERM.
+After the vcpu's SVE or SME configuration is finalized, further
+attempts to write these registers will fail with EPERM.
 
 arm64 bitmap feature firmware pseudo-registers have the following bit pattern::
 
@@ -3469,6 +3485,7 @@ The initial values are defined as:
 	- General Purpose registers, including PC and SP: set to 0
 	- FPSIMD/NEON registers: set to 0
 	- SVE registers: set to 0
+	- SME registers: set to 0
 	- System registers: Reset to their architecturally defined
 	  values as for a warm reset to EL1 (resp. SVC) or EL2 (in the
 	  case of EL2 being enabled).
@@ -3512,7 +3529,7 @@ Possible features:
 
 	- KVM_ARM_VCPU_SVE: Enables SVE for the CPU (arm64 only).
 	  Depends on KVM_CAP_ARM_SVE.
-	  Requires KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_SVE):
+	  Requires KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_VEC):
 
 	   * After KVM_ARM_VCPU_INIT:
 
@@ -3520,7 +3537,7 @@ Possible features:
 	        initial value of this pseudo-register indicates the best set of
 	        vector lengths possible for a vcpu on this host.
 
-	   * Before KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_SVE):
+	   * Before KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_VEC):
 
 	      - KVM_RUN and KVM_GET_REG_LIST are not available;
 
@@ -3533,11 +3550,40 @@ Possible features:
 	        KVM_SET_ONE_REG, to modify the set of vector lengths available
 	        for the vcpu.
 
-	   * After KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_SVE):
+	   * After KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_VEC):
 
 	      - the KVM_REG_ARM64_SVE_VLS pseudo-register is immutable, and can
 	        no longer be written using KVM_SET_ONE_REG.
 
+	- KVM_ARM_VCPU_SME: Enables SME for the CPU (arm64 only).
+	  Depends on KVM_CAP_ARM_SME.
+	  Requires KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_VEC):
+
+	   * After KVM_ARM_VCPU_INIT:
+
+	      - KVM_REG_ARM64_SME_VLS may be read using KVM_GET_ONE_REG: the
+	        initial value of this pseudo-register indicates the best set of
+	        vector lengths possible for a vcpu on this host.
+
+	   * Before KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_VEC):
+
+	      - KVM_RUN and KVM_GET_REG_LIST are not available;
+
+	      - KVM_GET_ONE_REG and KVM_SET_ONE_REG cannot be used to access
+	        the scalable architectural SVE registers
+	        KVM_REG_ARM64_SVE_ZREG(), KVM_REG_ARM64_SVE_PREG() or
+	        KVM_REG_ARM64_SVE_FFR, the matrix register
+		KVM_REG_ARM64_SME_ZA() or the LUT register KVM_REG_ARM64_ZT();
+
+	      - KVM_REG_ARM64_SME_VLS may optionally be written using
+	        KVM_SET_ONE_REG, to modify the set of vector lengths available
+	        for the vcpu.
+
+	   * After KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_VEC):
+
+	      - the KVM_REG_ARM64_SME_VLS pseudo-register is immutable, and can
+	        no longer be written using KVM_SET_ONE_REG.
+
 	- KVM_ARM_VCPU_HAS_EL2: Enable Nested Virtualisation support,
 	  booting the guest from EL2 instead of EL1.
 	  Depends on KVM_CAP_ARM_EL2.
@@ -5120,11 +5166,12 @@ Errors:
 
 Recognised values for feature:
 
-  =====      ===========================================
-  arm64      KVM_ARM_VCPU_SVE (requires KVM_CAP_ARM_SVE)
-  =====      ===========================================
+  =====      ==============================================================
+  arm64      KVM_ARM_VCPU_VEC (requires KVM_CAP_ARM_SVE or KVM_CAP_ARM_SME)
+  arm64      KVM_ARM_VCPU_SVE (alias for KVM_ARM_VCPU_VEC)
+  =====      ==============================================================
 
-Finalizes the configuration of the specified vcpu feature.
+Finalizes the configuration of the specified vcpu features.
 
 The vcpu must already have been initialised, enabling the affected feature, by
 means of a successful :ref:`KVM_ARM_VCPU_INIT <KVM_ARM_VCPU_INIT>` call with the

-- 
2.39.5
Re: [PATCH v8 11/29] KVM: arm64: Document the KVM ABI for SME
Posted by Peter Maydell 2 months, 2 weeks ago
On Tue, 2 Sept 2025 at 12:45, Mark Brown <broonie@kernel.org> wrote:
>
> SME, the Scalable Matrix Extension, is an arm64 extension which adds
> support for matrix operations, with core concepts patterned after SVE.

Hi; apologies for not having got round to looking at this earlier.

I haven't actually tried writing any code that uses this proposed
ABI, but mostly it looks OK to me. I have a few nits below, but
my main concern is the bits of text that say (or seem to say --
maybe I'm misinterpreting them) that various parts of how userspace
accesses the guest state (e.g. the fp regs) depend on the current
state of the vcpu, rather than being only a function of how the
vcpu was configured. That seems to me like it's unnecessarily awkward.
(More detail below.)

> If SME is enabled for a guest without SVE then the FPSIMD Vn registers
> must be accessed via the low 128 bits of the SVE Zn registers as is the
> case when SVE is enabled. This is not ideal but allows access to SVCR and
> the registers in any order without duplication or ambiguity about which
> values should take effect. This may be an issue for VMMs that are
> unaware of SME on systems that implement it without SVE if they let SME
> be enabled, the lack of access to Vn may surprise them, but it seems
> like an unusual implementation choice.
>
> For SME unware VMMs on systems with both SVE and SME support the SVE
> registers may be larger than expected, this should be less disruptive
> than on a system without SVE as they will simply ignore the high bits of
> the registers.

I think that since enabling SME is something the VMM has to actively
do, it isn't a big deal that they also need to do something in the
fp or sve register access codepaths to handle SME. You can't get
SME by surprise (same as you can't get SVE by surprise).

> Signed-off-by: Mark Brown <broonie@kernel.org>
> ---
>  Documentation/virt/kvm/api.rst | 115 +++++++++++++++++++++++++++++------------
>  1 file changed, 81 insertions(+), 34 deletions(-)
>
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index 6aa40ee05a4a..94a22407a1d4 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -406,7 +406,7 @@ Errors:
>               instructions from device memory (arm64)
>    ENOSYS     data abort outside memslots with no syndrome info and
>               KVM_CAP_ARM_NISV_TO_USER not enabled (arm64)
> -  EPERM      SVE feature set but not finalized (arm64)
> +  EPERM      SVE or SME feature set but not finalized (arm64)
>    =======    ==============================================================
>
>  This ioctl is used to run a guest virtual cpu.  While there are no
> @@ -2601,11 +2601,11 @@ Specifically:
>  ======================= ========= ===== =======================================
>
>  .. [1] These encodings are not accepted for SVE-enabled vcpus.  See
> -       :ref:`KVM_ARM_VCPU_INIT`.
> +       :ref:`KVM_ARM_VCPU_INIT`.  They are also not accepted when SME is
> +       enabled without SVE and the vcpu is in streaming mode.

Does this mean that on an SME-no-SVE VM the VMM needs to know
if the vcpu is currently in streaming mode or not to determine
whether to read the FP registers as fp_regs or sve regs? That
seems unpleasant -- I was expecting this to be strictly a
matter of how the VM was configured (as it is with SVE).

>         The equivalent register content can be accessed via bits [127:0] of
> -       the corresponding SVE Zn registers instead for vcpus that have SVE
> -       enabled (see below).
> +       the corresponding SVE Zn registers in these cases (see below).
>
>  arm64 CCSIDR registers are demultiplexed by CSSELR value::
>
> @@ -2636,24 +2636,34 @@ arm64 SVE registers have the following bit patterns::
>    0x6050 0000 0015 060 <slice:5>        FFR bits[256*slice + 255 : 256*slice]
>    0x6060 0000 0015 ffff                 KVM_REG_ARM64_SVE_VLS pseudo-register
>
> -Access to register IDs where 2048 * slice >= 128 * max_vq will fail with
> -ENOENT.  max_vq is the vcpu's maximum supported vector length in 128-bit
> -quadwords: see [2]_ below.
> +arm64 SME registers have the following bit patterns:
> +
> +  0x6080 0000 0017 00 <n:5> <slice:5>   ZA.H[n] bits[2048*slice + 2047 : 2048*slice]
> +  0x60XX 0000 0017 0100                 ZT0

What's the XX here ?

> +  0x6060 0000 0017 fffe                 KVM_REG_ARM64_SME_VLS pseudo-register
> +
> +Access to Z, P or ZA register IDs where 2048 * slice >= 128 * max_vq
> +will fail with ENOENT.  max_vq is the vcpu's maximum supported vector
> +length in 128-bit quadwords: see [2]_ below.

What about FFR registers ? Is their ENOENT condition the same,
or different?

> +
> +Access to the ZA and ZT0 registers is only available if SVCR.ZA is set
> +to 1.
>
>  These registers are only accessible on vcpus for which SVE is enabled.
>  See KVM_ARM_VCPU_INIT for details.
>
> -In addition, except for KVM_REG_ARM64_SVE_VLS, these registers are not
> -accessible until the vcpu's SVE configuration has been finalized
> -using KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_SVE).  See KVM_ARM_VCPU_INIT
> -and KVM_ARM_VCPU_FINALIZE for more information about this procedure.
> +In addition, except for KVM_REG_ARM64_SVE_VLS and
> +KVM_REG_ARM64_SME_VLS, these registers are not accessible until the
> +vcpu's SVE and SME configuration has been finalized using
> +KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_VEC).  See KVM_ARM_VCPU_INIT and
> +KVM_ARM_VCPU_FINALIZE for more information about this procedure.
>
> -KVM_REG_ARM64_SVE_VLS is a pseudo-register that allows the set of vector
> -lengths supported by the vcpu to be discovered and configured by
> -userspace.  When transferred to or from user memory via KVM_GET_ONE_REG
> -or KVM_SET_ONE_REG, the value of this register is of type
> -__u64[KVM_ARM64_SVE_VLS_WORDS], and encodes the set of vector lengths as
> -follows::
> +KVM_REG_ARM64_SVE_VLS and KVM_ARM64_VCPU_SME_VLS are pseudo-registers
> +that allows the set of vector lengths supported by the vcpu to be
> +discovered and configured by userspace.  When transferred to or from
> +user memory via KVM_GET_ONE_REG or KVM_SET_ONE_REG, the value of this
> +register is of type __u64[KVM_ARM64_SVE_VLS_WORDS], and encodes the
> +set of vector lengths as follows::
>
>    __u64 vector_lengths[KVM_ARM64_SVE_VLS_WORDS];
>
> @@ -2665,19 +2675,25 @@ follows::
>         /* Vector length vq * 16 bytes not supported */
>
>  .. [2] The maximum value vq for which the above condition is true is
> -       max_vq.  This is the maximum vector length available to the guest on
> -       this vcpu, and determines which register slices are visible through
> -       this ioctl interface.
> +       max_vq.  This is the maximum vector length currently available to
> +       the guest on this vcpu, and determines which register slices are
> +       visible through this ioctl interface.
> +
> +       If SME is supported then the max_vq used for the Z and P registers
> +       while SVCR.SM is 1 this vector length will be the maximum SME
> +       vector length available for the guest, otherwise it will be the
> +       maximum SVE vector length available.

I can't figure out what this paragraph is trying to say, partly
because it seems like it might be missing some text between
"is 1" and "this vector length".

In any case, the "while SVCR.SM is 1" part seems odd -- I
don't think this ABI should care about the runtime vcpu state,
only what the vcpu's max vector lengths were configured as.
My expectation would be that the max_vq for VMM register
access would be the maximum of the SVE and SME vector lengths
configured for the vcpu.

thanks
-- PMM
Re: [PATCH v8 11/29] KVM: arm64: Document the KVM ABI for SME
Posted by Mark Brown 2 months, 2 weeks ago
On Mon, Nov 24, 2025 at 03:48:06PM +0000, Peter Maydell wrote:
> On Tue, 2 Sept 2025 at 12:45, Mark Brown <broonie@kernel.org> wrote:

> > SME, the Scalable Matrix Extension, is an arm64 extension which adds
> > support for matrix operations, with core concepts patterned after SVE.

> I haven't actually tried writing any code that uses this proposed
> ABI, but mostly it looks OK to me. I have a few nits below, but
> my main concern is the bits of text that say (or seem to say --
> maybe I'm misinterpreting them) that various parts of how userspace
> accesses the guest state (e.g. the fp regs) depend on the current
> state of the vcpu, rather than being only a function of how the
> vcpu was configured. That seems to me like it's unnecessarily awkward.
> (More detail below.)

That was deliberate and I agree it is awkward; it was introduced as a
result of earlier review comments.  I had originally implemented an ABI
where the VL for the vector registers was the maximum of the SVE and SME
VLs but the feedback was that the ABI should instead follow what the
architecture does with the vector length and potentially presence of the
vector registers depending on the current streaming mode configuration.
It sounds like you would prefer something more like what was there
originally?

> > For SME unware VMMs on systems with both SVE and SME support the SVE
> > registers may be larger than expected, this should be less disruptive
> > than on a system without SVE as they will simply ignore the high bits of
> > the registers.

> I think that since enabling SME is something the VMM has to actively
> do, it isn't a big deal that they also need to do something in the
> fp or sve register access codepaths to handle SME. You can't get
> SME by surprise (same as you can't get SVE by surprise).

Yes, it's not going to affect anything without enabling it.  I can't
remember what that was in reference to, it clearly needs an update.

> >  .. [1] These encodings are not accepted for SVE-enabled vcpus.  See
> > -       :ref:`KVM_ARM_VCPU_INIT`.
> > +       :ref:`KVM_ARM_VCPU_INIT`.  They are also not accepted when SME is
> > +       enabled without SVE and the vcpu is in streaming mode.

> Does this mean that on an SME-no-SVE VM the VMM needs to know
> if the vcpu is currently in streaming mode or not to determine
> whether to read the FP registers as fp_regs or sve regs? That
> seems unpleasant -- I was expecting this to be strictly a
> matter of how the VM was configured (as it is with SVE).

Yes, it does.

> > +arm64 SME registers have the following bit patterns:

> > +  0x6080 0000 0017 00 <n:5> <slice:5>   ZA.H[n] bits[2048*slice + 2047 : 2048*slice]
> > +  0x60XX 0000 0017 0100                 ZT0

> What's the XX here ?

Sorry, will fill that in - thanks for spotting it.

> > +  0x6060 0000 0017 fffe                 KVM_REG_ARM64_SME_VLS pseudo-register
> > +
> > +Access to Z, P or ZA register IDs where 2048 * slice >= 128 * max_vq
> > +will fail with ENOENT.  max_vq is the vcpu's maximum supported vector
> > +length in 128-bit quadwords: see [2]_ below.

> What about FFR registers ? Is their ENOENT condition the same,
> or different?

It should be the same, will update to clarify.
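
To make the condition concrete, here is a sketch of the slice bounds
check as the documentation text states it. slice_access_check() is a
hypothetical helper written for illustration, not the kernel's
implementation.

```c
#include <assert.h>
#include <errno.h>

#define SLICE_BITS	2048u	/* bits covered by one register slice */
#define VQ_BITS		128u	/* bits per quadword of vector length */

/* Returns 0 if the slice is within max_vq, -ENOENT otherwise. */
static int slice_access_check(unsigned slice, unsigned max_vq)
{
	assert(max_vq >= 1);
	if (SLICE_BITS * slice >= VQ_BITS * max_vq)
		return -ENOENT;
	return 0;
}
```

For any max_vq up to 16 (a 2048 bit vector) only slice 0 passes the
check, so a VMM targeting such vector lengths only ever uses slice 0.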

> > +       max_vq.  This is the maximum vector length currently available to
> > +       the guest on this vcpu, and determines which register slices are
> > +       visible through this ioctl interface.

> > +       If SME is supported then the max_vq used for the Z and P registers
> > +       while SVCR.SM is 1 this vector length will be the maximum SME
> > +       vector length available for the guest, otherwise it will be the
> > +       maximum SVE vector length available.

> I can't figure out what this paragraph is trying to say, partly
> because it seems like it might be missing some text between
> "is 1" and "this vector length".

> In any case, the "while SVCR.SM is 1" part seems odd -- I
> don't think this ABI should care about the runtime vcpu state,
> only what the vcpu's max vector lengths were configured as.
> My expectation would be that the max_vq for VMM register
> access would be the maximum of the SVE and SME vector lengths
> configured for the vcpu.

This is attempting to say that, if the maximum VLs for SVE and SME
differ, the VL for the Z and P registers (and FFR) will vary depending
on whether the vCPU is in streaming mode, similarly to how the Z, P and
FFR registers disappear when we are not in streaming mode in an SME only
system.
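
A sketch contrasting the two behaviours under discussion, to make the
difference concrete. Both function names are hypothetical, and a maximum
of 0 quadwords is used here to stand for "feature not configured".

```c
#include <assert.h>
#include <stdbool.h>

/*
 * State dependent view, as in the posted patch: the size of Z, P and
 * FFR follows the current value of SVCR.SM. A result of 0 means the
 * registers are not visible at all (SME only vCPU outside streaming
 * mode).
 */
static unsigned zreg_vq_by_mode(unsigned sve_max_vq, unsigned sme_max_vq,
				bool svcr_sm)
{
	assert(sve_max_vq || sme_max_vq);
	return svcr_sm ? sme_max_vq : sve_max_vq;
}

/*
 * Configuration only view, as suggested in review: always the larger
 * of the two configured maximums, regardless of vCPU state.
 */
static unsigned zreg_vq_by_config(unsigned sve_max_vq, unsigned sme_max_vq)
{
	assert(sve_max_vq || sme_max_vq);
	return sve_max_vq > sme_max_vq ? sve_max_vq : sme_max_vq;
}
```

With SVE limited to 2 quadwords and SME to 4, the first variant makes the
register size seen by the VMM switch between 2 and 4 as the guest toggles
streaming mode, while the second always reports 4.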
Re: [PATCH v8 11/29] KVM: arm64: Document the KVM ABI for SME
Posted by Peter Maydell 2 months, 1 week ago
On Mon, 24 Nov 2025 at 20:13, Mark Brown <broonie@kernel.org> wrote:
>
> On Mon, Nov 24, 2025 at 03:48:06PM +0000, Peter Maydell wrote:
> > On Tue, 2 Sept 2025 at 12:45, Mark Brown <broonie@kernel.org> wrote:
>
> > > SME, the Scalable Matrix Extension, is an arm64 extension which adds
> > > support for matrix operations, with core concepts patterned after SVE.
>
> > I haven't actually tried writing any code that uses this proposed
> > ABI, but mostly it looks OK to me. I have a few nits below, but
> > my main concern is the bits of text that say (or seem to say --
> > maybe I'm misinterpreting them) that various parts of how userspace
> > accesses the guest state (e.g. the fp regs) depend on the current
> > state of the vcpu, rather than being only a function of how the
> > vcpu was configured. That seems to me like it's unnecessarily awkward.
> > (More detail below.)
>
> That was deliberate and I agree it is awkward, it was introduced as a
> result of earlier review comments.  I had originally implemented an ABI
> where the VL for the vector registers was the maximum of the SVE and SME
> VLs but the feedback was that the ABI should instead follow what the
> architecture does with the vector length and potentially presence of the
> vector registers depending on the current streaming mode configuration.
> It sounds like you would prefer something more like what was there
> originally?

Yes, that's what I would prefer. The "varies by current CPU state"
approach seems to me to be not the way we do things right now,
and to be awkward for the VMM side, so it ought to have a really
strong justification for why we need it.

Generally the VMM doesn't care about the actual current state of the
CPU, it just wants all the data (e.g. to send for migration). We don't
make the current SVE accessors change based on what the current SVE
vq length is or whether the guest has set the SVE enable bits -- we
have "if the vcpu supports SVE at all, data is always accessed via
the SVE accessors, and it's always the max_vq length, regardless of
how the vcpu has set its current vq length".

What's the benefit of making the way KVM exposes the data
bounce around based on the current CPU state? Does that
make things easier for the kernel internally?

-- PMM
Re: [PATCH v8 11/29] KVM: arm64: Document the KVM ABI for SME
Posted by Mark Brown 2 months, 1 week ago
On Thu, Nov 27, 2025 at 03:06:50PM +0000, Peter Maydell wrote:
> On Mon, 24 Nov 2025 at 20:13, Mark Brown <broonie@kernel.org> wrote:

> > That was deliberate and I agree it is awkward, it was introduced as a
> > result of earlier review comments.  I had originally implemented an ABI
> > where the VL for the vector registers was the maximum of the SVE and SME
> > VLs but the feedback was that the ABI should instead follow what the
> > architecture does with the vector length and potentially presence of the
> > vector registers depending on the current streaming mode configuration.
> > It sounds like you would prefer something more like what was there
> > originally?

> Yes, that's what I would prefer. The "varies by current CPU state"
> approach seems to me to be not the way we do things right now,
> and to be awkward for the VMM side, so it ought to have a really
> strong justification for why we need it.

> Generally the VMM doesn't care about the actual current state of the
> CPU, it just wants all the data (e.g. to send for migration). We don't
> make the current SVE accessors change based on what the current SVE
> vq length is or whether the guest has set the SVE enable bits -- we
> have "if the vcpu supports SVE at all, data is always accessed via
> the SVE accessors, and it's always the max_vq length, regardless of
> how the vcpu has set its current vq length".

OK, that's clear - that was my expectation for what userspace would want
too FWIW.

> What's the benefit of making the way KVM exposes the data
> bounce around based on the current CPU state? Does that
> make things easier for the kernel internally?

Yes, it makes life easier for the kernel internally.  If we expose the
registers to userspace with a potentially non-native format then we need
to keep track of the format things are currently stored in and rewrite
the data between the format we're exposing and the format we're going to
load/save to/from the hardware when those differ.  It's not an issue in
the normal fast path for running guests, it's only a concern when
userspace actually interacts with the affected registers.

This won't have come up in the SVE case since what's exposed is the
hypervisor view of the registers, which doesn't change based on what the
guest does, so you just need a bit of configuration of ZCR_ELx.LEN.  With
SME the current value of PSTATE.SM changes the view of the registers for
all ELs.
Re: [PATCH v8 11/29] KVM: arm64: Document the KVM ABI for SME
Posted by Dave Martin 2 months, 2 weeks ago
Hi,

On Mon, Nov 24, 2025 at 08:12:56PM +0000, Mark Brown wrote:
> On Mon, Nov 24, 2025 at 03:48:06PM +0000, Peter Maydell wrote:
> > On Tue, 2 Sept 2025 at 12:45, Mark Brown <broonie@kernel.org> wrote:
> 
> > > SME, the Scalable Matrix Extension, is an arm64 extension which adds
> > > support for matrix operations, with core concepts patterned after SVE.
> 
> > I haven't actually tried writing any code that uses this proposed
> > ABI, but mostly it looks OK to me. I have a few nits below, but
> > my main concern is the bits of text that say (or seem to say --
> > maybe I'm misinterpreting them) that various parts of how userspace
> > accesses the guest state (e.g. the fp regs) depend on the current
> > state of the vcpu, rather than being only a function of how the
> > vcpu was configured. That seems to me like it's unnecessarily awkward.
> > (More detail below.)
> 
> That was deliberate and I agree it is awkward, it was introduced as a
> result of earlier review comments.  I had originally implemented an ABI
> where the VL for the vector registers was the maximum of the SVE and SME
> VLs but the feedback was that the ABI should instead follow what the
> architecture does with the vector length and potentially presence of the
> vector registers depending on the current streaming mode configuration.
> It sounds like you would prefer something more like what was there
> originally?
> 
> > > For SME unaware VMMs on systems with both SVE and SME support the SVE
> > > registers may be larger than expected, this should be less disruptive
> > > than on a system without SVE as they will simply ignore the high bits of
> > > the registers.
> 
> > I think that since enabling SME is something the VMM has to actively
> > do, it isn't a big deal that they also need to do something in the
> > fp or sve register access codepaths to handle SME. You can't get
> > SME by surprise (same as you can't get SVE by surprise).
> 
> Yes, it's not going to affect anything without enabling it.  I can't
> remember what that was in reference to, it clearly needs an update.
> 
> > >  .. [1] These encodings are not accepted for SVE-enabled vcpus.  See
> > > -       :ref:`KVM_ARM_VCPU_INIT`.
> > > +       :ref:`KVM_ARM_VCPU_INIT`.  They are also not accepted when SME is
> > > +       enabled without SVE and the vcpu is in streaming mode.
> 
> > Does this mean that on an SME-no-SVE VM the VMM needs to know
> > if the vcpu is currently in streaming mode or not to determine
> > whether to read the FP registers as fp_regs or sve regs? That
> > seems unpleasant -- I was expecting this to be strictly a
> > matter of how the VM was configured (as it is with SVE).
> 
> Yes, it does.

Ditto from me about not having looked at this earlier...


Is the above condition right re streaming mode?  The original reason
for this restriction was that the SVE Z-regs and FPSIMD V-regs are
aliases when SVE is present.  To avoid having to worry about how to
order register accesses and/or paste parts of them together, we went
down the road of banishing encodings that alias a subset of the
register state accessed by some other encoding.

In line with this principle, with SME Vn and Zn are aliases when
*not* in streaming mode, so allowing access through the Vn view feels
problematic too?  (And when in streaming mode, the Vn regs don't exist
at all.)

(Whether the proposed ABI is considered awkward for VMMs or not is a
separate matter...)

> 
> > > +arm64 SME registers have the following bit patterns:
> 
> > > +  0x6080 0000 0017 00 <n:5> <slice:5>   ZA.H[n] bits[2048*slice + 2047 : 2048*slice]
> > > +  0x60XX 0000 0017 0100                 ZT0
> 
> > What's the XX here ?
> 
> Sorry, will fill that in - thanks for spotting it.
> 
> > > +  0x6060 0000 0017 fffe                 KVM_REG_ARM64_SME_VLS pseudo-register
> > > +
> > > +Access to Z, P or ZA register IDs where 2048 * slice >= 128 * max_vq
> > > +will fail with ENOENT.  max_vq is the vcpu's maximum supported vector
> > > +length in 128-bit quadwords: see [2]_ below.
> 
> > What about FFR registers ? Is their ENOENT condition the same,
> > or different?
> 
> It should be the same, will update to clarify.
>
> > > +       max_vq.  This is the maximum vector length currently available to
> > > +       the guest on this vcpu, and determines which register slices are
> > > +       visible through this ioctl interface.
> 
> > > +       If SME is supported then the max_vq used for the Z and P registers
> > > +       while SVCR.SM is 1 this vector length will be the maximum SME
> > > +       vector length available for the guest, otherwise it will be the
> > > +       maximum SVE vector length available.

The max_vq name here is not ABI; it's just linking concepts together in
the documentation text.

So, can we give explicitly different names to these two max_vq values?

Splitting the affected register descriptions into "SVCR.SM == 0" and
"SVCR.SM == 1" cases would also be helpful to make this special-casing clear.

> 
> > I can't figure out what this paragraph is trying to say, partly
> > because it seems like it might be missing some text between
> > "is 1" and "this vector length".
> 
> > In any case, the "while SVCR.SM is 1" part seems odd -- I
> > don't think this ABI should care about the runtime vcpu state,
> > only what the vcpu's max vector lengths were configured as.
> > My expectation would be that the max_vq for VMM register
> > access would be the maximum of the SVE and SME vector lengths
> > configured for the vcpu.
> 
> This is attempting to say that the VL for the Z and P registers (and
> FFR) will vary depending on if the vCPU is in streaming mode or not if
> the maximum VL for SVE and SME differs, similarly to how the Z, P and
> FFR registers disappear when we are not in streaming mode in a SME only
> system.

May flipping SVCR.SM through KVM_SET_ONE_REG have the architectural
effect of zeroing the vector regs?  That feels like something that
should be stated explicitly.


Also, in general:

I'd agree that this mutating interface feels odd, and does not follow
the original spirit of the design here.

But the SME architecture doesn't fit well with the spirit of the
original KVM ABI here either, so I guess there won't be a perfect
solution.


It seems that when SME is enabled in the vCPU features and the VMM is
planning to dump or set affected registers, there is a requirement to
dump / set SVCR.SM first, and then go down one of two code paths.  Can
this be called out explicitly?  This is a departure from the
previous interaction model, so it probably deserves its own section,
which can then be cross-referenced from individual reg
descriptions.

SVCR.SM exhibits this modality w.r.t a specific set of affected
register encodings; it would be good to have that captured clearly in
one place.
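
As a rough illustration of the two code paths described above, here is a
minimal C sketch of how a VMM might choose which register view to use once
it has read SVCR.SM.  It models only the behaviour discussed in this thread
(V registers exposed only on VMs without SVE, and not accepted in streaming
mode on an SME-only VM).  The names vcpu_fp_cfg, fp_view and pick_fp_view
are hypothetical and are not part of the KVM ABI.

```c
#include <stdbool.h>

/* Hypothetical model of the vCPU features the VMM enabled at init time. */
struct vcpu_fp_cfg {
	bool has_sve;
	bool has_sme;
};

/* Which family of register IDs to use for the FP/vector state. */
enum fp_view {
	FP_VIEW_VREGS,	/* FPSIMD V registers (fp_regs encodings) */
	FP_VIEW_ZREGS,	/* Z/P/FFR encodings */
};

/*
 * Assumed rule, taken from the discussion in this thread:
 *  - with SVE enabled, only the Z/P/FFR encodings are accepted;
 *  - with SME but no SVE, the answer depends on the current SVCR.SM,
 *    so the caller must read (or write) SVCR first.
 */
static enum fp_view pick_fp_view(const struct vcpu_fp_cfg *cfg, bool svcr_sm)
{
	if (cfg->has_sve)
		return FP_VIEW_ZREGS;
	if (cfg->has_sme && svcr_sm)
		return FP_VIEW_ZREGS;
	return FP_VIEW_VREGS;
}
```

The point of the sketch is that svcr_sm is an input: for an SME-only VM the
view cannot be derived from the vCPU configuration alone.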

(This may or may not make life easier for VMMs -- I'll leave it to
Peter to comment on that!)

Cheers
---Dave
Re: [PATCH v8 11/29] KVM: arm64: Document the KVM ABI for SME
Posted by Mark Brown 2 months, 2 weeks ago
On Wed, Nov 26, 2025 at 05:23:47PM +0000, Dave Martin wrote:
> On Mon, Nov 24, 2025 at 08:12:56PM +0000, Mark Brown wrote:
> > On Mon, Nov 24, 2025 at 03:48:06PM +0000, Peter Maydell wrote:

> > > >  .. [1] These encodings are not accepted for SVE-enabled vcpus.  See
> > > > -       :ref:`KVM_ARM_VCPU_INIT`.
> > > > +       :ref:`KVM_ARM_VCPU_INIT`.  They are also not accepted when SME is
> > > > +       enabled without SVE and the vcpu is in streaming mode.

> > > Does this mean that on an SME-no-SVE VM the VMM needs to know
> > > if the vcpu is currently in streaming mode or not to determine
> > > whether to read the FP registers as fp_regs or sve regs? That
> > > seems unpleasant -- I was expecting this to be strictly a
> > > matter of how the VM was configured (as it is with SVE).

> > Yes, it does.

> Is the above condition right re streaming mode?  The original reason
> for this restriction was that the SVE Z-regs and FPSIMD V-regs are
> aliases when SVE is present.  To avoid having to worry about how to
> order register accesses and/or paste parts of them together, we went
> down the road of banishing encodings that alias a subset of the
> register state accessed by some other encoding.

I queried the issue with requiring that writes to the registers be done
in a specific order - we apparently have some other examples of this
already (I would need to go and check which specifically) so that was
seen as OK.

> In line with this principle, with SME Vn and Zn are aliases when
> *not* in streaming mode, so allowing access through the Vn view feels
> problematic too?  (And when in streaming mode, the Vn regs don't exist
> at all.)

The ABI proposed here is that the V registers will only be available
with a VM that lacks SVE.  You'll never have both simultaneously; rather,
on an SME-without-SVE VM, which one is available at any given moment
varies with streaming mode.  This obviously has complications, but
aliasing is not one of them.

Another option would be to represent the V registers as 128 bit Z
registers, giving you something similar to how they'd appear on a VM
with both SVE and SME for a SME only VM.

> (Whether the proposed ABI is considered awkward for VMMs or not is a
> separate matter...)

Indeed.

> > > > +       max_vq.  This is the maximum vector length currently available to
> > > > +       the guest on this vcpu, and determines which register slices are
> > > > +       visible through this ioctl interface.
> > 
> > > > +       If SME is supported then the max_vq used for the Z and P registers
> > > > +       while SVCR.SM is 1 this vector length will be the maximum SME
> > > > +       vector length available for the guest, otherwise it will be the
> > > > +       maximum SVE vector length available.

> The max_vq name here is not ABI; it's just linking concepts together in
> the documentation text.

> So, can we give explicitly different names to these two max_vq values?

We could call them sve_max_vq and sme_max_vq?
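
With those names, the selection between the two values and the ENOENT
condition from the quoted documentation text could be sketched roughly as
below.  This is only an illustration of the rule as discussed in this
thread, not the kernel implementation.  The helper names effective_max_vq
and check_slice are hypothetical, and the sketch assumes that a VM without
SVE has an sve_max_vq of 0.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Pick the max_vq that bounds Z/P/FFR accesses.  Assumption from the
 * thread: the SME value applies while SVCR.SM is 1, the SVE value
 * otherwise.
 */
static unsigned int effective_max_vq(unsigned int sve_max_vq,
				     unsigned int sme_max_vq, bool svcr_sm)
{
	return svcr_sm ? sme_max_vq : sve_max_vq;
}

/*
 * Apply the documented condition: access fails with ENOENT when
 * 2048 * slice >= 128 * max_vq.  Returns 0 if the slice is visible.
 */
static int check_slice(unsigned int slice, unsigned int max_vq)
{
	if (2048u * slice >= 128u * max_vq)
		return -ENOENT;
	return 0;
}
```

Under that assumption, an SME-only VM outside streaming mode gets a max_vq
of 0, so every slice fails the check, which matches the registers
disappearing in that state.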

> Splitting the affected register descriptions into "SVCR.SM == 0" and
> "SVCR.SM == 1" cases would also be helpful to make this special-casing clear.

Possibly I'm looking at the wrong thing here but the overall text for
describing the vector registers is relatively long so I worry that it'd
be harder for readers to play spot the difference if there was
duplication.  I figured explicitly calling out the differences would be
clearer and less error prone in terms of any future updates.

> > This is attempting to say that the VL for the Z and P registers (and
> > FFR) will vary depending on if the vCPU is in streaming mode or not if
> > the maximum VL for SVE and SME differs, similarly to how the Z, P and
> > FFR registers disappear when we are not in streaming mode in a SME only
> > system.

> May flipping SVCR.SM through KVM_SET_ONE_REG have the architectural
> effect of zeroing the vector regs?  That feels like something that
> should be stated explicitly.

Yes, it should zero them - I'll find some place/way to add that.

> I'd agree that this mutating interface feels odd, and does not follow
> the original spirit of the design here.

> But the SME architecture doesn't fit well with the spirit of the
> original KVM ABI here either, so I guess there won't be a perfect
> solution.

Something's going to be awkward somewhere.

> It seems that when SME is enabled in the vCPU features and the VMM is
> planning to dump or set affected registers, there is a requirement to
> dump / set SVCR.SM first, and then go down one of two code paths.  Can
> this be called out explicitly?  This is a departure from the
> previous interaction model, so it probably deserves its own section,
> which can then be cross-referenced from individual reg
> descriptions.

> SVCR.SM exhibits this modality w.r.t a specific set of affected
> register encodings; it would be good to have that captured clearly in
> one place.

As I said above, my understanding is that this is not actually a
departure from the current situation; this not being noticed probably
highlights why it'd be good to improve the documentation here!  I think
grouping all behaviours like this together would be good from a
usability point of view.  I don't know how much of that fits
directly in the ABI document or in a separate "here's some gotchas" type
document; things are already getting a bit difficult to manage.
Possibly both.