[PATCH 0/5] target/arm: Support variable sized coprocessor registers

Gavin Shan posted 5 patches 2 years, 1 month ago
Test checkpatch passed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20220411065842.63880-1-gshan@redhat.com
Maintainers: Peter Maydell <peter.maydell@linaro.org>, Alexander Graf <agraf@csgraf.de>, Paolo Bonzini <pbonzini@redhat.com>
target/arm/cpu.h     | 12 +++++--
target/arm/helper.c  | 27 +++++++++-----
target/arm/hvf/hvf.c | 20 +++++++++--
target/arm/kvm.c     | 85 ++++++++++++++++++++++++++++++++++++++------
target/arm/machine.c | 30 ++++++++++++----
5 files changed, 145 insertions(+), 29 deletions(-)
[PATCH 0/5] target/arm: Support variable sized coprocessor registers
Posted by Gavin Shan 2 years, 1 month ago
There are two arrays for each CPU, to store the indexes and values of the
coprocessor registers. Currently, 8 bytes fixed storage space is reserved
for each coprocessor register. However, larger coprocessor registers have
been defined and exposed by KVM. Except SVE registers, no coprocessor
register exceeds 8 bytes in size. It doesn't mean large coprocessor registers
won't be exploited in future. For example, I'm looking into SDEI virtualization
support, which isn't merged into Linux upstream yet. I have plan to add
several coprocessor ("firmware pseudo") registers to assist the migration.

This series adds one more array, to track the position or location in the
storage array (@cpreg_values) for the corresponding coprocessor register.
The storage space for one particular coprocessor register is always to
8 bytes so that we needn't worry about the alignment issue. In this way,
the coprocessor register size can be variable.

I had some internal discussion with Eric and Drew. They suggested to
send one mail to qemu-arm@nongnu.org, asking if there is any challenges
to support variable sized coprocessor registers. So another intention
of this series is to invoke the discussion.

PATCH[1-3] adds one more array (@cpreg_value_indexes) to track the location
in the storage array (@cpreg_values) for coprocessor registers. The storage
space for one particular coprocessor register is determined by the additional
array, which is named as indirect addressing mode. Each coprocessor register
is still having 8 bytes fixed storage space, so that thd old mechanism
(direct addressing mode) and indirect address mode can co-exist, event in
migration circumstance. PATCH[4] migrates the additional array. PATCH[5]
initializes @cpreg_value_indexes for KVM.

Gavin Shan (5):
  target/arm/tcg: Indirect addressing for coprocessor register storage
  target/arm/hvf: Indirect addressing for coprocessor register storage
  target/arm/kvm: Indirect addressing for coprocessor register storage
  target/arm: Migrate coprocessor register indirect addressing
    information
  target/arm/kvm: Support coprocessor register with variable size

 target/arm/cpu.h     | 12 +++++--
 target/arm/helper.c  | 27 +++++++++-----
 target/arm/hvf/hvf.c | 20 +++++++++--
 target/arm/kvm.c     | 85 ++++++++++++++++++++++++++++++++++++++------
 target/arm/machine.c | 30 ++++++++++++----
 5 files changed, 145 insertions(+), 29 deletions(-)

-- 
2.23.0
Re: [PATCH 0/5] target/arm: Support variable sized coprocessor registers
Posted by Peter Maydell 2 years, 1 month ago
On Mon, 11 Apr 2022 at 07:59, Gavin Shan <gshan@redhat.com> wrote:
>
> There are two arrays for each CPU, to store the indexes and values of the
> coprocessor registers. Currently, 8 bytes fixed storage space is reserved
> for each coprocessor register. However, larger coprocessor registers have
> been defined and exposed by KVM. Except SVE registers, no coprocessor
> register exceeds 8 bytes in size. It doesn't mean large coprocessor registers
> won't be exploited in future. For example, I'm looking into SDEI virtualization
> support, which isn't merged into Linux upstream yet. I have plan to add
> several coprocessor ("firmware pseudo") registers to assist the migration.

So, can you give an example of coprocessor registers which are
not 8 bytes in size? How are they accessed by the guest?
If we need to support them then we need to support them, but this
cover letter/series doesn't seem to me to provide enough detail
to make the case that they really are necessary.

Also, we support SVE today, and we don't have variable size
coprocessor registers. Is there a bug here that we would be
fixing ?

thanks
-- PMM
Re: [PATCH 0/5] target/arm: Support variable sized coprocessor registers
Posted by Gavin Shan 2 years, 1 month ago
Hi Peter,

On 4/11/22 5:22 PM, Peter Maydell wrote:
> On Mon, 11 Apr 2022 at 07:59, Gavin Shan <gshan@redhat.com> wrote:
>>
>> There are two arrays for each CPU, to store the indexes and values of the
>> coprocessor registers. Currently, 8 bytes fixed storage space is reserved
>> for each coprocessor register. However, larger coprocessor registers have
>> been defined and exposed by KVM. Except SVE registers, no coprocessor
>> register exceeds 8 bytes in size. It doesn't mean large coprocessor registers
>> won't be exploited in future. For example, I'm looking into SDEI virtualization
>> support, which isn't merged into Linux upstream yet. I have plan to add
>> several coprocessor ("firmware pseudo") registers to assist the migration.
> 
> So, can you give an example of coprocessor registers which are
> not 8 bytes in size? How are they accessed by the guest?
> If we need to support them then we need to support them, but this
> cover letter/series doesn't seem to me to provide enough detail
> to make the case that they really are necessary.
> 
> Also, we support SVE today, and we don't have variable size
> coprocessor registers. Is there a bug here that we would be
> fixing ?
> 

[Cc Oliver Upon]

Apart from SVE registers, I don't think we have any more large registers
whose sizes exceed 8 bytes for now, until SDEI virtualization needs more
large registers for migration.

I'm working the KVM series to support SDEI virtualization and last revision
is v6. One of the requirement is to migrate the SDEI events and states.
In v5, the migration is done by the dedicated ioctl commands and it was
suggested by Oliver to use {GET, SET}_ONE_REG. Note that the series isn't
merged yet. So I had the prototype to support SDEI's migration through
{GET, SET}_ONE_REG. Note that those newly added registers are inaccessible
from guest.

https://github.com/gwshan/linux/commit/c2e5de5e210de6b003d1e1330eeb0958cf7007f5
(branch: kvm/arm64_sdei)

https://lore.kernel.org/lkml/20220403153911.12332-13-gshan@redhat.com/T/   (last revision: v6)
https://lore.kernel.org/kvmarm/YjtYuk+Jx1dFPQQ9@google.com/                (v5)

There are large coprocessor register sizes, like U2048, exposed by KVM.
However, it seems we never support those large coprocessor registers.
I'm not sure if we have any challenges to support those large registers,
or we don't have the needs yet?

Thanks,
Gavin
Re: [PATCH 0/5] target/arm: Support variable sized coprocessor registers
Posted by Peter Maydell 2 years, 1 month ago
On Mon, 11 Apr 2022 at 10:50, Gavin Shan <gshan@redhat.com> wrote:
> On 4/11/22 5:22 PM, Peter Maydell wrote:
> > So, can you give an example of coprocessor registers which are
> > not 8 bytes in size? How are they accessed by the guest?
> > If we need to support them then we need to support them, but this
> > cover letter/series doesn't seem to me to provide enough detail
> > to make the case that they really are necessary.
> >
> > Also, we support SVE today, and we don't have variable size
> > coprocessor registers. Is there a bug here that we would be
> > fixing ?
> >
>
> [Cc Oliver Upon]
>
> Apart from SVE registers, I don't think we have any more large registers
> whose sizes exceed 8 bytes for now, until SDEI virtualization needs more
> large registers for migration.
>
> I'm working the KVM series to support SDEI virtualization and last revision
> is v6. One of the requirement is to migrate the SDEI events and states.
> In v5, the migration is done by the dedicated ioctl commands and it was
> suggested by Oliver to use {GET, SET}_ONE_REG. Note that the series isn't
> merged yet. So I had the prototype to support SDEI's migration through
> {GET, SET}_ONE_REG. Note that those newly added registers are inaccessible
> from guest.
>
> https://github.com/gwshan/linux/commit/c2e5de5e210de6b003d1e1330eeb0958cf7007f5
> (branch: kvm/arm64_sdei)
>
> https://lore.kernel.org/lkml/20220403153911.12332-13-gshan@redhat.com/T/   (last revision: v6)
> https://lore.kernel.org/kvmarm/YjtYuk+Jx1dFPQQ9@google.com/                (v5)

Could you please describe what you're trying to do here rather
than asking me to wade through a big kernel patchset that's
adding support for a firmware interface I don't know?

> There are large coprocessor register sizes, like U2048, exposed by KVM.
> However, it seems we never support those large coprocessor registers.
> I'm not sure if we have any challenges to support those large registers,
> or we don't have the needs yet?

The general idea of the coprocessor register accessors for aarch64 KVM
is that we're giving the VMM access to the same registers that the guest
accesses via the msr/mrs instructions. Those instructions by definition
access 64 bit quantities. In a few places we've borrowed this mechanism
to define KVM-specific pseudo-registers, but that wasn't the primary
design intent. So maybe it makes sense to extend it to do what you're
trying to, or maybe that would be the tail wagging the dog. It's hard
to tell without more detail on what exactly you're trying to expose
to the VMM here.

(The ONE_REG API is used by more than just aarch64 and more than just
for coprocessor registers, which is why it supports lots of different
sizes.)

thanks
-- PMM
Re: [PATCH 0/5] target/arm: Support variable sized coprocessor registers
Posted by Gavin Shan 2 years, 1 month ago
Hi Peter,

On 4/11/22 6:05 PM, Peter Maydell wrote:
> On Mon, 11 Apr 2022 at 10:50, Gavin Shan <gshan@redhat.com> wrote:
>> On 4/11/22 5:22 PM, Peter Maydell wrote:
>>> So, can you give an example of coprocessor registers which are
>>> not 8 bytes in size? How are they accessed by the guest?
>>> If we need to support them then we need to support them, but this
>>> cover letter/series doesn't seem to me to provide enough detail
>>> to make the case that they really are necessary.
>>>
>>> Also, we support SVE today, and we don't have variable size
>>> coprocessor registers. Is there a bug here that we would be
>>> fixing ?
>>>
>>
>> [Cc Oliver Upon]
>>
>> Apart from SVE registers, I don't think we have any more large registers
>> whose sizes exceed 8 bytes for now, until SDEI virtualization needs more
>> large registers for migration.
>>
>> I'm working the KVM series to support SDEI virtualization and last revision
>> is v6. One of the requirement is to migrate the SDEI events and states.
>> In v5, the migration is done by the dedicated ioctl commands and it was
>> suggested by Oliver to use {GET, SET}_ONE_REG. Note that the series isn't
>> merged yet. So I had the prototype to support SDEI's migration through
>> {GET, SET}_ONE_REG. Note that those newly added registers are inaccessible
>> from guest.
>>
>> https://github.com/gwshan/linux/commit/c2e5de5e210de6b003d1e1330eeb0958cf7007f5
>> (branch: kvm/arm64_sdei)
>>
>> https://lore.kernel.org/lkml/20220403153911.12332-13-gshan@redhat.com/T/   (last revision: v6)
>> https://lore.kernel.org/kvmarm/YjtYuk+Jx1dFPQQ9@google.com/                (v5)
> 
> Could you please describe what you're trying to do here rather
> than asking me to wade through a big kernel patchset that's
> adding support for a firmware interface I don't know?
> 

I'm working on two serieses to support SDEI virtualization and asynchronous
page fault (Async PF) on arm64. When currently running thread is experiencing
stage-2 page fault, Async PF picks another thread for execution. After the
stage-2 page fault is resolved, that thread resumes. Async PF is driven by
notifications sent from host to guest. One of the notifications is delivered
by SDEI event. The SDEI event is something like NMI to some extent. When
the SDEI event is raised by host, it will be delivered to guest and the
previously registered (associated) event handler is invoked in guest.

One of the concerns for SDEI virtualization is how those SDEI events
and states can be migrated. Previously, I had dedicated ioctl commands
to read/write the SDEI events and states. Later, Oliver suggested to
use {GET,SET}_ONE_REG to do migration. I think it's much better than
the dedicated ioctl commands in terms of maintenance cost. In this case,
several 'firmware pseudo registers', like what PSCI is doing, will be
added and some of their sizes will exceed 8 bytes. On the other hand,
the SDEI events and states can be naturally treated as CPU's properties.
It's another reason why pseudo-registers fits the need to migrate
those SDEI events and states.

More information about SDEI can be found from its spec:

https://developer.arm.com/documentation/den0054/latest

>> There are large coprocessor register sizes, like U2048, exposed by KVM.
>> However, it seems we never support those large coprocessor registers.
>> I'm not sure if we have any challenges to support those large registers,
>> or we don't have the needs yet?
> 
> The general idea of the coprocessor register accessors for aarch64 KVM
> is that we're giving the VMM access to the same registers that the guest
> accesses via the msr/mrs instructions. Those instructions by definition
> access 64 bit quantities. In a few places we've borrowed this mechanism
> to define KVM-specific pseudo-registers, but that wasn't the primary
> design intent. So maybe it makes sense to extend it to do what you're
> trying to, or maybe that would be the tail wagging the dog. It's hard
> to tell without more detail on what exactly you're trying to expose
> to the VMM here.
> 
> (The ONE_REG API is used by more than just aarch64 and more than just
> for coprocessor registers, which is why it supports lots of different
> sizes.)
> 

Yes, we're considering 'KVM specific pseudo-registers' for migrating
SDEI events and states. Those pseudo-registers shouldn't be accessed
from guest, meaning they're only accessed by VMM (QEMU). I think
pseudo-registers fits the need very well, to migrate SDEI events
and states.

Thanks,
Gavin
Re: [PATCH 0/5] target/arm: Support variable sized coprocessor registers
Posted by Andrew Jones 2 years, 1 month ago
On Mon, Apr 11, 2022 at 10:22:59AM +0100, Peter Maydell wrote:
> On Mon, 11 Apr 2022 at 07:59, Gavin Shan <gshan@redhat.com> wrote:
> >
> > There are two arrays for each CPU, to store the indexes and values of the
> > coprocessor registers. Currently, 8 bytes fixed storage space is reserved
> > for each coprocessor register. However, larger coprocessor registers have
> > been defined and exposed by KVM. Except SVE registers, no coprocessor
> > register exceeds 8 bytes in size. It doesn't mean large coprocessor registers
> > won't be exploited in future. For example, I'm looking into SDEI virtualization
> > support, which isn't merged into Linux upstream yet. I have plan to add
> > several coprocessor ("firmware pseudo") registers to assist the migration.
> 
> So, can you give an example of coprocessor registers which are
> not 8 bytes in size? How are they accessed by the guest?
> If we need to support them then we need to support them, but this
> cover letter/series doesn't seem to me to provide enough detail
> to make the case that they really are necessary.
> 
> Also, we support SVE today, and we don't have variable size
> coprocessor registers. Is there a bug here that we would be
> fixing ?

SVE registers are KVM_REG_SIZE_U2048 and KVM_REG_SIZE_U256 sized
registers. They work fine (just like the VFP registers which are
KVM_REG_SIZE_U128 sized). They work because they don't get stored in the
cpreg list. SVE and CORE (which includes VFP) registers are filtered
out by kvm_arm_reg_syncs_via_cpreg_list(). Since they're filtered
out they need to be handled specifically by kvm_arch_get/put_registers()

I asked Gavin to check if following the SVE pattern made sense for his
use case prior to sending this series, but I don't see the rationale for
not following the SVE pattern in this cover letter. Gavin, can you please
explain why following the SVE pattern doesn't work? Or, are you trying to
avoid adding an additional special case?

Thanks,
drew
Re: [PATCH 0/5] target/arm: Support variable sized coprocessor registers
Posted by Peter Maydell 2 years, 1 month ago
On Mon, 11 Apr 2022 at 13:02, Andrew Jones <drjones@redhat.com> wrote:
>
> On Mon, Apr 11, 2022 at 10:22:59AM +0100, Peter Maydell wrote:
> > Also, we support SVE today, and we don't have variable size
> > coprocessor registers. Is there a bug here that we would be
> > fixing ?
>
> SVE registers are KVM_REG_SIZE_U2048 and KVM_REG_SIZE_U256 sized
> registers. They work fine (just like the VFP registers which are
> KVM_REG_SIZE_U128 sized). They work because they don't get stored in the
> cpreg list. SVE and CORE (which includes VFP) registers are filtered
> out by kvm_arm_reg_syncs_via_cpreg_list(). Since they're filtered
> out they need to be handled specifically by kvm_arch_get/put_registers()

Right, this is the distinction between ONE_REG registers and
coprocessor registers (which are a subset of ONE_REG registers).
We wouldn't want to handle SVE regs in the copro list anyway,
I think, because we want their state to end up in env->vfp.zregs[]
so the gdbstub can find it there. And we wouldn't have benefited
from the copro regs handling's "no need for new QEMU to handle
migrating state of a new register" because we needed a lot of
special case code for SVE and couldn't enable it by default
for other reasons.

If we do add non-64-bit cpregs on the kernel side then we need to
make those new registers opt-in, because currently deployed QEMU
will refuse to start if the kernel passes it a register in the
KVM_GET_REG_LIST that is larger than 64 bits and isn't
KVM_REG_ARM_CORE or KVM_REG_ARM64_SVE (assuming I'm not misreading
the QEMU code).

-- PMM
Re: [PATCH 0/5] target/arm: Support variable sized coprocessor registers
Posted by Gavin Shan 2 years, 1 month ago
Hi Peter,

On 4/11/22 8:10 PM, Peter Maydell wrote:
> On Mon, 11 Apr 2022 at 13:02, Andrew Jones <drjones@redhat.com> wrote:
>> On Mon, Apr 11, 2022 at 10:22:59AM +0100, Peter Maydell wrote:
>>> Also, we support SVE today, and we don't have variable size
>>> coprocessor registers. Is there a bug here that we would be
>>> fixing ?
>>
>> SVE registers are KVM_REG_SIZE_U2048 and KVM_REG_SIZE_U256 sized
>> registers. They work fine (just like the VFP registers which are
>> KVM_REG_SIZE_U128 sized). They work because they don't get stored in the
>> cpreg list. SVE and CORE (which includes VFP) registers are filtered
>> out by kvm_arm_reg_syncs_via_cpreg_list(). Since they're filtered
>> out they need to be handled specifically by kvm_arch_get/put_registers()
> 
> Right, this is the distinction between ONE_REG registers and
> coprocessor registers (which are a subset of ONE_REG registers).
> We wouldn't want to handle SVE regs in the copro list anyway,
> I think, because we want their state to end up in env->vfp.zregs[]
> so the gdbstub can find it there. And we wouldn't have benefited
> from the copro regs handling's "no need for new QEMU to handle
> migrating state of a new register" because we needed a lot of
> special case code for SVE and couldn't enable it by default
> for other reasons.
> 

Yep, those new introduced SDEI pseudo-registers, the intention
is to avoid the special case code. So the coprocessor register
list fits the need well. The only barrier to use the coprocessor
register list is the variable register sizes.

> If we do add non-64-bit cpregs on the kernel side then we need to
> make those new registers opt-in, because currently deployed QEMU
> will refuse to start if the kernel passes it a register in the
> KVM_GET_REG_LIST that is larger than 64 bits and isn't
> KVM_REG_ARM_CORE or KVM_REG_ARM64_SVE (assuming I'm not misreading
> the QEMU code).
> 

Yes, we need make those new registers opt-in absolutely. Otherwise,
the old qemu, which doesn't have variable sized registers supported,
will crash on host kernel, where large sized registers are exposed
unconditionally.

I spent some time to think of the mechanisms for opt-in. There are
two options as I can figure out: (1) Using KVM_CAP_ARM_SDEI to check
if the large sized registers exist. (2) Using newly introduced
pseudo-register ("KVM_REG_ARM_STD_BMAP") in Raghavendra's series [1].
The individual bit in this pseudo-register corresponds to one
service in "standard hypervisor" category, where SDEI falls into.

I prefer (2) because those services or firmware interfaces are
exposed in a collective way by KVM_REG_ARM_STD_BMAP, comparing
to the individual capabilities. However, they are same in essence.
Another benefit to use KVM_REG_ARM_STD_BMAP is hiding SDEI interface
and the large sized registers for old QEMU.

[1] https://lore.kernel.org/linux-arm-kernel/20220407011605.1966778-10-rananta@google.com/T/#m0bc1aa4048ca157e8e99c593b3f349b879032543

Thanks,
Gavin
Re: [PATCH 0/5] target/arm: Support variable sized coprocessor registers
Posted by Gavin Shan 2 years, 1 month ago
Hi Drew,

On 4/11/22 8:02 PM, Andrew Jones wrote:
> On Mon, Apr 11, 2022 at 10:22:59AM +0100, Peter Maydell wrote:
>> On Mon, 11 Apr 2022 at 07:59, Gavin Shan <gshan@redhat.com> wrote:
>>>
>>> There are two arrays for each CPU, to store the indexes and values of the
>>> coprocessor registers. Currently, 8 bytes fixed storage space is reserved
>>> for each coprocessor register. However, larger coprocessor registers have
>>> been defined and exposed by KVM. Except SVE registers, no coprocessor
>>> register exceeds 8 bytes in size. It doesn't mean large coprocessor registers
>>> won't be exploited in future. For example, I'm looking into SDEI virtualization
>>> support, which isn't merged into Linux upstream yet. I have plan to add
>>> several coprocessor ("firmware pseudo") registers to assist the migration.
>>
>> So, can you give an example of coprocessor registers which are
>> not 8 bytes in size? How are they accessed by the guest?
>> If we need to support them then we need to support them, but this
>> cover letter/series doesn't seem to me to provide enough detail
>> to make the case that they really are necessary.
>>
>> Also, we support SVE today, and we don't have variable size
>> coprocessor registers. Is there a bug here that we would be
>> fixing ?
> 
> SVE registers are KVM_REG_SIZE_U2048 and KVM_REG_SIZE_U256 sized
> registers. They work fine (just like the VFP registers which are
> KVM_REG_SIZE_U128 sized). They work because they don't get stored in the
> cpreg list. SVE and CORE (which includes VFP) registers are filtered
> out by kvm_arm_reg_syncs_via_cpreg_list(). Since they're filtered
> out they need to be handled specifically by kvm_arch_get/put_registers()
> 
> I asked Gavin to check if following the SVE pattern made sense for his
> use case prior to sending this series, but I don't see the rationale for
> not following the SVE pattern in this cover letter. Gavin, can you please
> explain why following the SVE pattern doesn't work? Or, are you trying to
> avoid adding an additional special case?
> 

Yes, SVE registers are special case. They're not synchronized through
coprocessor register list as you mentioned. For SDEI, we mimic PSCI
because both of them are firmware interfaces.

PSCI's pseudo-registers are synchronized and migrated through the
coprocessor register list. Besides, treating SDEI as additional
special case should work, but more maintaining load will be introduced.
We need separate functions to get/set SDEI's pseudo registers, like
what we did for SVE.

Thanks,
Gavin