[PATCH v16 00/10] VIRTIO-IOMMU device

Posted by Eric Auger 4 years, 2 months ago
This series implements the QEMU virtio-iommu device.

This matches the v0.12 spec (voted) and the corresponding
virtio-iommu driver upstreamed in Linux 5.3. All kernel dependencies
are resolved for DT integration. The virtio-iommu device can be
instantiated in the ARM virt machine using:

"-device virtio-iommu-pci".

Non-DT mode is not yet supported as it still has unresolved kernel
dependencies [1].

This feature targets QEMU 5.0.

Integration with vhost devices and vfio devices is not part
of this series. Please follow Bharat's respins [2].

Best Regards

Eric

This series can be found at:
https://github.com/eauger/qemu/tree/v4.2-virtio-iommu-v16

References:
[1] [RFC 00/13] virtio-iommu on non-devicetree platforms
[2] [PATCH RFC v5 0/5] virtio-iommu: VFIO integration

Testing:
- tested with a guest using virtio-net-pci
  (vhost=off,iommu_platform,disable-modern=off,disable-legacy=on)
  and virtio-blk-pci (see the example command line below)
- migration
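
For reference, a command line along these lines should exercise the above
setup (illustrative sketch only: image path, netdev id, machine and memory
options are placeholders, not taken from the actual test environment):

  qemu-system-aarch64 -M virt -cpu host -enable-kvm -m 4G \
      -device virtio-iommu-pci \
      -netdev tap,id=net0,vhost=off \
      -device virtio-net-pci,netdev=net0,iommu_platform=on,disable-modern=off,disable-legacy=on \
      -drive if=none,id=disk0,file=guest.qcow2 \
      -device virtio-blk-pci,drive=disk0,iommu_platform=on,disable-modern=off,disable-legacy=on \
      ...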

History:

v15 -> v16:
- Collected Jean, Peter and Michael's R-bs
- the only patches without an R-b are the one related to hw/arm/virt.c
  and the last patch, added in this version
- Made the virtio-iommu-pci not hotpluggable (I dared to
  leave the R-b though)
- Renamed create_virtio_iommu into create_virtio_iommu_dt_bindings
- Added an entry in the MAINTAINERS file

v14 -> v15:
- removed x-dt-binding and just kept the check on the hotplug_handler
- removed "tests: Add virtio-iommu test" as the check on the
  hotplug_handler fails on the PC machine
- destroy mappings in put_domain and remove
  g_tree_destroy(domain->mappings) in virtio_iommu_detach

v13 -> v14:
- added "virtio-iommu-pci: Introduce the x-dt-binding option"
- Removed the mappings gtree ref counting and simply delete
  the gtree when the last EP is detached from the domain


Eric Auger (10):
  virtio-iommu: Add skeleton
  virtio-iommu: Decode the command payload
  virtio-iommu: Implement attach/detach command
  virtio-iommu: Implement map/unmap
  virtio-iommu: Implement translate
  virtio-iommu: Implement fault reporting
  virtio-iommu: Support migration
  virtio-iommu-pci: Add virtio iommu pci support
  hw/arm/virt: Add the virtio-iommu device tree mappings
  MAINTAINERS: add virtio-iommu related files

 MAINTAINERS                      |   6 +
 hw/arm/virt.c                    |  57 +-
 hw/virtio/Kconfig                |   5 +
 hw/virtio/Makefile.objs          |   2 +
 hw/virtio/trace-events           |  20 +
 hw/virtio/virtio-iommu-pci.c     | 104 ++++
 hw/virtio/virtio-iommu.c         | 890 +++++++++++++++++++++++++++++++
 include/hw/arm/virt.h            |   2 +
 include/hw/pci/pci.h             |   1 +
 include/hw/virtio/virtio-iommu.h |  61 +++
 qdev-monitor.c                   |   1 +
 11 files changed, 1142 insertions(+), 7 deletions(-)
 create mode 100644 hw/virtio/virtio-iommu-pci.c
 create mode 100644 hw/virtio/virtio-iommu.c
 create mode 100644 include/hw/virtio/virtio-iommu.h

-- 
2.20.1


Re: [PATCH v16 00/10] VIRTIO-IOMMU device
Posted by Daniel P. Berrangé 4 years, 1 month ago
On Fri, Feb 14, 2020 at 02:27:35PM +0100, Eric Auger wrote:
> This series implements the QEMU virtio-iommu device.
> 
> This matches the v0.12 spec (voted) and the corresponding
> virtio-iommu driver upstreamed in 5.3. All kernel dependencies
> are resolved for DT integration. The virtio-iommu can be
> instantiated in ARM virt using:
> 
> "-device virtio-iommu-pci".

Is there any more documentation besides this ?

I'm wondering on the intended usage of this, and its relation
or pros/cons vs other iommu devices

You mention Arm here, but can this virtio-iommu-pci be used on
ppc64, s390x, x86_64 too ?  If so, is it a better choice than
using intel-iommu on x86_64 ?  Anything else that is relevant
for management applications to know about when using this ?


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


Re: [PATCH v16 00/10] VIRTIO-IOMMU device
Posted by Auger Eric 4 years, 1 month ago
Hi Daniel,

On 2/27/20 12:17 PM, Daniel P. Berrangé wrote:
> On Fri, Feb 14, 2020 at 02:27:35PM +0100, Eric Auger wrote:
>> This series implements the QEMU virtio-iommu device.
>>
>> This matches the v0.12 spec (voted) and the corresponding
>> virtio-iommu driver upstreamed in 5.3. All kernel dependencies
>> are resolved for DT integration. The virtio-iommu can be
>> instantiated in ARM virt using:
>>
>> "-device virtio-iommu-pci".
> 
> Is there any more documentation besides this ?

not yet in qemu.
> 
> I'm wondering on the intended usage of this, and its relation
> or pros/cons vs other iommu devices

Maybe if you want to catch up on the topic, looking at the very first
kernel RFC may be a good starting point. Motivation, pros & cons were
discussed in that thread (hey, April 2017!)
https://lists.linuxfoundation.org/pipermail/iommu/2017-April/021217.html

On ARM we have SMMUv3 emulation. But the VFIO integration is not
possible because the SMMU does not have any "caching mode" and my nested
paging kernel series is blocked. So the only solution to integrate with
VFIO is the upcoming virtio-iommu.

In general the pros that were put forward are: virtio-iommu is
architecture agnostic, it removes the burden to accurately model complex
device states, and the driver can support virtualization specific
optimizations without being constrained by production driver maintenance.
The cons are perf and mem footprint if we do not consider any optimization.

You can have a look at

http://events17.linuxfoundation.org/sites/events/files/slides/viommu_arm.pdf

> 
> You mention Arm here, but can this virtio-iommu-pci be used on
> ppc64, s390x, x86_64 too ? 

Not yet. At the moment we are stuck on the non-DT integration at the
kernel level. We can instantiate the device in machvirt with DT boot only.

Work is ongoing on the kernel side, by Jean-Philippe, to support non-DT integration:

[1] [PATCH 0/3] virtio-iommu on non-devicetree platforms
(https://www.spinics.net/lists/linux-virtualization/msg41391.html)

This does not rely on ACPI anymore.

Originally the plan was to integrate with ACPI (IORT) but Michael pushed
to pass the binding info between the protected devices and the IOMMU
through the PCI cfg space. Also this could serve environments where we
do not have ACPI. I think some people are reluctant to expose the
virtio-iommu in the [IORT] ACPI table.

But definitely the end goal is to support the virtio-iommu for other
archs. Integration with x86 is already working based on IORT or [1].


> If so, is it a better choice than
> using intel-iommu on x86_64 ?  Anything else that is relevant
> for management applications to know about when using this ?

I think we are still at an early stage and this would be premature even
if feasible.

Hope it helps

Thanks

Eric
> 
> 
> Regards,
> Daniel
> 


Re: [PATCH v16 00/10] VIRTIO-IOMMU device
Posted by Zhangfei Gao 4 years, 1 month ago
Hi Eric

On Thu, Feb 27, 2020 at 9:50 PM Auger Eric <eric.auger@redhat.com> wrote:
>
> Hi Daniel,
>
> On 2/27/20 12:17 PM, Daniel P. Berrangé wrote:
> > On Fri, Feb 14, 2020 at 02:27:35PM +0100, Eric Auger wrote:
> >> This series implements the QEMU virtio-iommu device.
> >>
> >> This matches the v0.12 spec (voted) and the corresponding
> >> virtio-iommu driver upstreamed in 5.3. All kernel dependencies
> >> are resolved for DT integration. The virtio-iommu can be
> >> instantiated in ARM virt using:
> >>
> >> "-device virtio-iommu-pci".
> >
> > Is there any more documentation besides this ?
>
> not yet in qemu.
> >
> > I'm wondering on the intended usage of this, and its relation
> > or pros/cons vs other iommu devices
>
> Maybe if you want to catch up on the topic, looking at the very first
> kernel RFC may be a good starting point. Motivation, pros & cons were
> discussed in that thread (hey, April 2017!)
> https://lists.linuxfoundation.org/pipermail/iommu/2017-April/021217.html
>
> on ARM we have SMMUv3 emulation. But the VFIO integration is not
> possible because SMMU does not have any "caching mode" and my nested
> paging kernel series is blocked. So the only solution to integrate with
> VFIO is looming virtio-iommu.
>
> In general the pros that were put forward are: virtio-iommu is
> architecture agnostic, removes the burden to accurately model complex
> device states, driver can support virtualization specific optimizations
> without being constrained by production driver maintenance. Cons is perf
> and mem footprint if we do not consider any optimization.
>
> You can have a look at
>
> http://events17.linuxfoundation.org/sites/events/files/slides/viommu_arm.pdf
>
Thanks for the patches.

Could I ask one question?
To support vSVA and PASID in the guest, which direction do you recommend,
virtio-iommu or vSMMU (your nested paging)?

Do we still have any obstacles?
Would you mind giving some breakdown?
Jean mentioned PASID is still not supported in QEMU.

Thanks

Re: [PATCH v16 00/10] VIRTIO-IOMMU device
Posted by Auger Eric 4 years, 1 month ago
Hi Zhangfei,
On 3/3/20 4:23 AM, Zhangfei Gao wrote:
> Hi Eric
> 
> On Thu, Feb 27, 2020 at 9:50 PM Auger Eric <eric.auger@redhat.com> wrote:
>>
>> Hi Daniel,
>>
>> On 2/27/20 12:17 PM, Daniel P. Berrangé wrote:
>>> On Fri, Feb 14, 2020 at 02:27:35PM +0100, Eric Auger wrote:
>>>> This series implements the QEMU virtio-iommu device.
>>>>
>>>> This matches the v0.12 spec (voted) and the corresponding
>>>> virtio-iommu driver upstreamed in 5.3. All kernel dependencies
>>>> are resolved for DT integration. The virtio-iommu can be
>>>> instantiated in ARM virt using:
>>>>
>>>> "-device virtio-iommu-pci".
>>>
>>> Is there any more documentation besides this ?
>>
>> not yet in qemu.
>>>
>>> I'm wondering on the intended usage of this, and its relation
>>> or pros/cons vs other iommu devices
>>
>> Maybe if you want to catch up on the topic, looking at the very first
>> kernel RFC may be a good starting point. Motivation, pros & cons were
>> discussed in that thread (hey, April 2017!)
>> https://lists.linuxfoundation.org/pipermail/iommu/2017-April/021217.html
>>
>> on ARM we have SMMUv3 emulation. But the VFIO integration is not
>> possible because SMMU does not have any "caching mode" and my nested
>> paging kernel series is blocked. So the only solution to integrate with
>> VFIO is looming virtio-iommu.
>>
>> In general the pros that were put forward are: virtio-iommu is
>> architecture agnostic, removes the burden to accurately model complex
>> device states, driver can support virtualization specific optimizations
>> without being constrained by production driver maintenance. Cons is perf
>> and mem footprint if we do not consider any optimization.
>>
>> You can have a look at
>>
>> http://events17.linuxfoundation.org/sites/events/files/slides/viommu_arm.pdf
>>
> Thanks for the patches.
> 
> Could I ask one question?
> To support vSVA and pasid in guest, which direction you recommend,
> virtio-iommu or vSMMU (your nested paging)
> 
> Do we still have any obstacles?
you can ask the question but not sure I can answer ;-)

1) The SMMUv3 2-stage implementation is blocked by Will at the kernel level.

Despite this situation I may/can respin it as Marvell said they were
interested in this effort. If you are also interested (I know you
tested it several times and I am grateful to you for that), please reply
to:
[PATCH v9 00/14] SMMUv3 Nested Stage Setup (IOMMU part)
(https://patchwork.kernel.org/cover/11039871/) and say you are
interested in that work so that maintainers are aware there are
potential users.

At the moment I have not supported multiple CDs because it introduced
other dependencies.

2) virtio-iommu

So only virtio-iommu DT boot on machvirt is currently supported. For
non-DT, Jean respinned his kernel series
"[PATCH v2 0/3] virtio-iommu on x86 and non-devicetree platforms" as you
may know. However non-DT integration is still controversial. Michael is
pushing for putting the binding info in the PCI config space. Joerg
yesterday challenged this solution and said he would prefer ACPI
integration. ACPI support depends on an ACPI spec update & vote anyway.

To support PASID at the virtio-iommu level you also need virtio-iommu API
extensions to be proposed and written, plus kernel work. So that's a long
road. I will let Jean-Philippe comment on that.

I would just say that Intel is working on a nested paging solution with
their emulated intel-iommu. We can help them get that upstream and
partly benefit from this work.

> Would you mind give some breakdown.
> Jean mentioned PASID still not supported in QEMU.
Do you mean support of multiple CDs in the emulated SMMU? That's a thing
I could implement quite easily. What is more tricky is how to test it.

Thanks

Eric
> 
> Thanks
> 


Re: [PATCH v16 00/10] VIRTIO-IOMMU device
Posted by Zhangfei Gao 4 years, 1 month ago
On Tue, Mar 3, 2020 at 5:41 PM Auger Eric <eric.auger@redhat.com> wrote:
>
> Hi Zhangfei,
> On 3/3/20 4:23 AM, Zhangfei Gao wrote:
> > Hi Eric
> >
> > On Thu, Feb 27, 2020 at 9:50 PM Auger Eric <eric.auger@redhat.com> wrote:
> >>
> >> Hi Daniel,
> >>
> >> On 2/27/20 12:17 PM, Daniel P. Berrangé wrote:
> >>> On Fri, Feb 14, 2020 at 02:27:35PM +0100, Eric Auger wrote:
> >>>> This series implements the QEMU virtio-iommu device.
> >>>>
> >>>> This matches the v0.12 spec (voted) and the corresponding
> >>>> virtio-iommu driver upstreamed in 5.3. All kernel dependencies
> >>>> are resolved for DT integration. The virtio-iommu can be
> >>>> instantiated in ARM virt using:
> >>>>
> >>>> "-device virtio-iommu-pci".
> >>>
> >>> Is there any more documentation besides this ?
> >>
> >> not yet in qemu.
> >>>
> >>> I'm wondering on the intended usage of this, and its relation
> >>> or pros/cons vs other iommu devices
> >>
> >> Maybe if you want to catch up on the topic, looking at the very first
> >> kernel RFC may be a good starting point. Motivation, pros & cons were
> >> discussed in that thread (hey, April 2017!)
> >> https://lists.linuxfoundation.org/pipermail/iommu/2017-April/021217.html
> >>
> >> on ARM we have SMMUv3 emulation. But the VFIO integration is not
> >> possible because SMMU does not have any "caching mode" and my nested
> >> paging kernel series is blocked. So the only solution to integrate with
> >> VFIO is looming virtio-iommu.
> >>
> >> In general the pros that were put forward are: virtio-iommu is
> >> architecture agnostic, removes the burden to accurately model complex
> >> device states, driver can support virtualization specific optimizations
> >> without being constrained by production driver maintenance. Cons is perf
> >> and mem footprint if we do not consider any optimization.
> >>
> >> You can have a look at
> >>
> >> http://events17.linuxfoundation.org/sites/events/files/slides/viommu_arm.pdf
> >>
> > Thanks for the patches.
> >
> > Could I ask one question?
> > To support vSVA and pasid in guest, which direction you recommend,
> > virtio-iommu or vSMMU (your nested paging)
> >
> > Do we still have any obstacles?
> you can ask the question but not sure I can answer ;-)
>
> 1) SMMUv3 2stage implementation is blocked by Will at kernel level.
>
> Despite this situation I may/can respin as Marvell said they were
> interested in this effort. If you are also interested in (I know you
> tested it several times and I am grateful to you for that), please reply
> to:
> [PATCH v9 00/14] SMMUv3 Nested Stage Setup (IOMMU part)
> (https://patchwork.kernel.org/cover/11039871/) and say you are
> interested in that work so that maintainers are aware there are
> potential users.
>
> At the moment I have not supported multiple CDs because it introduced
> other dependencies.
>
> 2) virtio-iommu
>
> So only virtio-iommu dt boot on machvirt is currently supported. For non
> DT, Jean respinned his kernel series
> "[PATCH v2 0/3] virtio-iommu on x86 and non-devicetree platforms" as you
> may know. However non DT integration still is controversial. Michael is
> pushing for putting the binding info the PCI config space. Joerg
> yesterday challenged this solution and said he would prefer ACPI
> integration. ACPI support depends on ACPI spec update & vote anyway.
>
> To support PASID at virtio-iommu level you also need virtio-iommu API
> extensions to be proposed and written + kernel works. So that's a long
> road. I will let Jean-Philippe comment on that.
>
> I would just say that Intel is working on nested paging solution with
> their emulated intel-iommu. We can help them getting that upstream and
> partly benefit from this work.
>
> > Would you mind give some breakdown.
> > Jean mentioned PASID still not supported in QEMU.
> Do you mean support of multiple CDs in the emulated SMMU? That's a thing
> I could implement quite easily. What is more tricky is how to test it.

Thanks Eric

Discussed with Jean before, here are some obstacles for vSVA via nested paging.
Do you think they are still big issues?

Copy "
* PASID support in QEMU, I don't think there is anything yet
// this is not a big issue, as per your comments.

* Page response support in VFIO and QEMU. With Eric's series we can
inject recoverable faults into the guest, but there is no channel for
the guest to RESUME the stall after fixing it.

* We can't use DVM in nested mode unless the VMID is shared with the
CPU. For that we'll need the host SMMU driver to hook into the KVM VMID
allocator, just like we do for the ASID allocator. I haven't yet
investigated how to do that. It's possible to do vSVA without DVM
though, by sending all TLB invalidations through the SMMU command queue.
"

Thanks

Re: [PATCH v16 00/10] VIRTIO-IOMMU device
Posted by Auger Eric 4 years, 1 month ago
Hi Zhangfei,

On 3/4/20 7:08 AM, Zhangfei Gao wrote:
> On Tue, Mar 3, 2020 at 5:41 PM Auger Eric <eric.auger@redhat.com> wrote:
>>
>> Hi Zhangfei,
>> On 3/3/20 4:23 AM, Zhangfei Gao wrote:
>>> Hi Eric
>>>
>>> On Thu, Feb 27, 2020 at 9:50 PM Auger Eric <eric.auger@redhat.com> wrote:
>>>>
>>>> Hi Daniel,
>>>>
>>>> On 2/27/20 12:17 PM, Daniel P. Berrangé wrote:
>>>>> On Fri, Feb 14, 2020 at 02:27:35PM +0100, Eric Auger wrote:
>>>>>> This series implements the QEMU virtio-iommu device.
>>>>>>
>>>>>> This matches the v0.12 spec (voted) and the corresponding
>>>>>> virtio-iommu driver upstreamed in 5.3. All kernel dependencies
>>>>>> are resolved for DT integration. The virtio-iommu can be
>>>>>> instantiated in ARM virt using:
>>>>>>
>>>>>> "-device virtio-iommu-pci".
>>>>>
>>>>> Is there any more documentation besides this ?
>>>>
>>>> not yet in qemu.
>>>>>
>>>>> I'm wondering on the intended usage of this, and its relation
>>>>> or pros/cons vs other iommu devices
>>>>
>>>> Maybe if you want to catch up on the topic, looking at the very first
>>>> kernel RFC may be a good starting point. Motivation, pros & cons were
>>>> discussed in that thread (hey, April 2017!)
>>>> https://lists.linuxfoundation.org/pipermail/iommu/2017-April/021217.html
>>>>
>>>> on ARM we have SMMUv3 emulation. But the VFIO integration is not
>>>> possible because SMMU does not have any "caching mode" and my nested
>>>> paging kernel series is blocked. So the only solution to integrate with
>>>> VFIO is looming virtio-iommu.
>>>>
>>>> In general the pros that were put forward are: virtio-iommu is
>>>> architecture agnostic, removes the burden to accurately model complex
>>>> device states, driver can support virtualization specific optimizations
>>>> without being constrained by production driver maintenance. Cons is perf
>>>> and mem footprint if we do not consider any optimization.
>>>>
>>>> You can have a look at
>>>>
>>>> http://events17.linuxfoundation.org/sites/events/files/slides/viommu_arm.pdf
>>>>
>>> Thanks for the patches.
>>>
>>> Could I ask one question?
>>> To support vSVA and pasid in guest, which direction you recommend,
>>> virtio-iommu or vSMMU (your nested paging)
>>>
>>> Do we still have any obstacles?
>> you can ask the question but not sure I can answer ;-)
>>
>> 1) SMMUv3 2stage implementation is blocked by Will at kernel level.
>>
>> Despite this situation I may/can respin as Marvell said they were
>> interested in this effort. If you are also interested in (I know you
>> tested it several times and I am grateful to you for that), please reply
>> to:
>> [PATCH v9 00/14] SMMUv3 Nested Stage Setup (IOMMU part)
>> (https://patchwork.kernel.org/cover/11039871/) and say you are
>> interested in that work so that maintainers are aware there are
>> potential users.
>>
>> At the moment I have not supported multiple CDs because it introduced
>> other dependencies.
>>
>> 2) virtio-iommu
>>
>> So only virtio-iommu dt boot on machvirt is currently supported. For non
>> DT, Jean respinned his kernel series
>> "[PATCH v2 0/3] virtio-iommu on x86 and non-devicetree platforms" as you
>> may know. However non DT integration still is controversial. Michael is
>> pushing for putting the binding info the PCI config space. Joerg
>> yesterday challenged this solution and said he would prefer ACPI
>> integration. ACPI support depends on ACPI spec update & vote anyway.
>>
>> To support PASID at virtio-iommu level you also need virtio-iommu API
>> extensions to be proposed and written + kernel works. So that's a long
>> road. I will let Jean-Philippe comment on that.
>>
>> I would just say that Intel is working on nested paging solution with
>> their emulated intel-iommu. We can help them getting that upstream and
>> partly benefit from this work.
>>
>>> Would you mind give some breakdown.
>>> Jean mentioned PASID still not supported in QEMU.
>> Do you mean support of multiple CDs in the emulated SMMU? That's a thing
>> I could implement quite easily. What is more tricky is how to test it.
> 
> Thanks Eric
> 
> Discussed with Jean before, here are some obstacles for vSVA via nested paging.
> Do you think they are still big issues?
> 
> Copy "
> * PASID support in QEMU, I don't think there is anything yet
> // this is not a big issue as your comments.
> 
> * Page response support in VFIO and QEMU. With Eric's series we can
> inject recoverable faults into the guest, but there is no channel for
> the guest to RESUME the stall after fixing it.
I guess this matches a command sent through the SMMUv3 command queue
(CMD_PRI_RESP) that should be trapped by QEMU and injected to the
physical SMMU, right?

I think everybody misses that injection path and that's not specific to
virtio-iommu. PRS is not currently addressed by Intel's in-flight kernel
series ([PATCH V9 00/10] Nested Shared Virtual Address (SVA) VT-d
support) either.

I think the topic is complex enough to separate the concerns and try to
move forward in incremental steps, hence my efforts to push for the simple
nested use case. Can't you support vSVA without PRS first (I think this
is Intel's strategy too)?
> 
> * We can't use DVM in nested mode unless the VMID is shared with the
> CPU. For that we'll need the host SMMU driver to hook into the KVM VMID
> allocator, just like we do for the ASID allocator. I haven't yet
> investigated how to do that. It's possible to do vSVA without DVM
> though, by sending all TLB invalidations through the SMMU command queue.
> "
OK.

From the above arguments I am not sure there are technical blockers with
the nested paging implementation. For sure there are things that are not
supported, because I did not address this topic yet.

If I were to work on this, you did not answer about the testing feasibility.

Thanks

Eric
> 
> Thanks
> 


Re: [PATCH v16 00/10] VIRTIO-IOMMU device
Posted by Jean-Philippe Brucker 4 years, 1 month ago
On Wed, Mar 04, 2020 at 09:41:44AM +0100, Auger Eric wrote:
> >>> Could I ask one question?
> >>> To support vSVA and pasid in guest, which direction you recommend,
> >>> virtio-iommu or vSMMU (your nested paging)
> >>>
> >>> Do we still have any obstacles?
> >> you can ask the question but not sure I can answer ;-)
> >>
> >> 1) SMMUv3 2stage implementation is blocked by Will at kernel level.
> >>
> >> Despite this situation I may/can respin as Marvell said they were
> >> interested in this effort.

Do you know if they want vSVA as well or only nested translation?

> >> If you are also interested in (I know you
> >> tested it several times and I am grateful to you for that), please reply
> >> to:
> >> [PATCH v9 00/14] SMMUv3 Nested Stage Setup (IOMMU part)
> >> (https://patchwork.kernel.org/cover/11039871/) and say you are
> >> interested in that work so that maintainers are aware there are
> >> potential users.
> >>
> >> At the moment I have not supported multiple CDs because it introduced
> >> other dependencies.
> >>
> >> 2) virtio-iommu
> >>
> >> So only virtio-iommu dt boot on machvirt is currently supported. For non
> >> DT, Jean respinned his kernel series
> >> "[PATCH v2 0/3] virtio-iommu on x86 and non-devicetree platforms" as you
> >> may know. However non DT integration still is controversial. Michael is
> >> pushing for putting the binding info the PCI config space. Joerg
> >> yesterday challenged this solution and said he would prefer ACPI
> >> integration. ACPI support depends on ACPI spec update & vote anyway.
> >>
> >> To support PASID at virtio-iommu level you also need virtio-iommu API
> >> extensions to be proposed and written + kernel works. So that's a long
> >> road. I will let Jean-Philippe comment on that.

Yeah, let's put that stuff on hold. vSVA with virtio-iommu requires about
the same amount of work in the host kernel as vSMMU, minus the NESTED_MSI
stuff. The device implementation would be simpler, but the guest driver is
difficult (I'd need to extract the CD table code from the SMMU driver
again). And obtaining better performance than vSMMU would then require
upstreaming vhost-iommu. I do have incomplete drafts and prototypes but
I'll put them aside until users (other than hardware validation) show up
and actually need performance or things like unpinned stage-2.

> >> I would just say that Intel is working on nested paging solution with
> >> their emulated intel-iommu. We can help them getting that upstream and
> >> partly benefit from this work.
> >>
> >>> Would you mind give some breakdown.
> >>> Jean mentioned PASID still not supported in QEMU.
> >> Do you mean support of multiple CDs in the emulated SMMU? That's a thing
> >> I could implement quite easily. What is more tricky is how to test it.
> > 
> > Thanks Eric
> > 
> > Discussed with Jean before, here are some obstacles for vSVA via nested paging.
> > Do you think they are still big issues?
> > 
> > Copy "
> > * PASID support in QEMU, I don't think there is anything yet
> > // this is not a big issue as your comments.
> > 
> > * Page response support in VFIO and QEMU. With Eric's series we can
> > inject recoverable faults into the guest, but there is no channel for
> > the guest to RESUME the stall after fixing it.
> I guess this matches a command sent through the SMMUv3 command queue
> (CMD_PRI_RESP) that should be trapped by QEMU and injected to the
> physical SMMU, right?
> 
> I think everybody misses that injection path and that's not specific to
> virtio-iommu. PRS is not currently addressed by in-flight Intel's kernel
> series ([PATCH V9 00/10] Nested Shared Virtual Address (SVA) VT-d
> support) either.
> 
> I think the topic is complex enough to separate the concerns and try to
> move forward in incremental steps hence my efforts to push for simple
> nested use case. Can't you support vSVA without PRS first (I think this
> Intel's strategy too)

Not really, for sharing guest process address spaces you need I/O page
faults. You can test PASID alone without PRI by using auxiliary domains in
the guest, so I'd advise to start with that, but it requires modifications
to the device driver.

> > 
> > * We can't use DVM in nested mode unless the VMID is shared with the
> > CPU. For that we'll need the host SMMU driver to hook into the KVM VMID
> > allocator, just like we do for the ASID allocator. I haven't yet
> > investigated how to do that. It's possible to do vSVA without DVM
> > though, by sending all TLB invalidations through the SMMU command queue.
> > "

Hm we're already mandating DVM for host SVA, so I'd say mandate it for
vSVA as well. We'd avoid a ton of context switches, especially for the zip
accelerator which doesn't require ATC invalidations. The host needs to pin
the VMID allocated by KVM and write it in the endpoint's STE.

Thanks,
Jean

> OK.
> 
> From the above arguments I am not sure there are technical blockers with
> nested paging implementation. For sure there are things that are not
> supported, because I did not address this topic yet.
> 
> If I were to work on this, you did not answer bout the testing feasibility.
> 
> Thanks
> 
> Eric
> > 
> > Thanks
> > 
> 

RE: [PATCH v16 00/10] VIRTIO-IOMMU device
Posted by Tian, Kevin 4 years, 1 month ago
> From: Jean-Philippe Brucker <jean-philippe@linaro.org>
> Sent: Thursday, March 5, 2020 12:47 AM
>
[...]
> > >
> > > * We can't use DVM in nested mode unless the VMID is shared with the
> > > CPU. For that we'll need the host SMMU driver to hook into the KVM VMID
> > > allocator, just like we do for the ASID allocator. I haven't yet
> > > investigated how to do that. It's possible to do vSVA without DVM
> > > though, by sending all TLB invalidations through the SMMU command queue.
> > > "
> 
> Hm we're already mandating DVM for host SVA, so I'd say mandate it for
> vSVA as well. We'd avoid a ton of context switches, especially for the zip
> accelerator which doesn't require ATC invalidations. The host needs to pin
> the VMID allocated by KVM and write it in the endpoint's STE.
> 

Curious... what is DVM and how is it related to SVA? Is it SMMU specific?

Re: [PATCH v16 00/10] VIRTIO-IOMMU device
Posted by Jean-Philippe Brucker 4 years, 1 month ago
On Thu, Mar 05, 2020 at 02:56:20AM +0000, Tian, Kevin wrote:
> > From: Jean-Philippe Brucker <jean-philippe@linaro.org>
> > Sent: Thursday, March 5, 2020 12:47 AM
> >
> [...]
> > > >
> > > > * We can't use DVM in nested mode unless the VMID is shared with the
> > > > CPU. For that we'll need the host SMMU driver to hook into the KVM VMID
> > > > allocator, just like we do for the ASID allocator. I haven't yet
> > > > investigated how to do that. It's possible to do vSVA without DVM
> > > > though, by sending all TLB invalidations through the SMMU command queue.
> > > > "
> > 
> > Hm we're already mandating DVM for host SVA, so I'd say mandate it for
> > vSVA as well. We'd avoid a ton of context switches, especially for the zip
> > accelerator which doesn't require ATC invalidations. The host needs to pin
> > the VMID allocated by KVM and write it in the endpoint's STE.
> > 
> 
> Curious... what is DVM and how is it related to SVA? Is it SMMU specific?

Yes it stands for "Distributed Virtual Memory", an Arm interconnect
protocol. When sharing a process address space, TLB invalidations from the
CPU are broadcasted to the SMMU, so we don't have to send commands through
the SMMU queue to invalidate IOTLBs. However ATCs from PCIe endpoints do
not participate in DVM and still have to be invalidated by hand.

Thanks,
Jean

RE: [PATCH v16 00/10] VIRTIO-IOMMU device
Posted by Tian, Kevin 4 years, 1 month ago
> From: Jean-Philippe Brucker <jean-philippe@linaro.org>
> Sent: Thursday, March 5, 2020 3:34 PM
> 
> On Thu, Mar 05, 2020 at 02:56:20AM +0000, Tian, Kevin wrote:
> > > From: Jean-Philippe Brucker <jean-philippe@linaro.org>
> > > Sent: Thursday, March 5, 2020 12:47 AM
> > >
> > [...]
> > > > >
> > > > > * We can't use DVM in nested mode unless the VMID is shared with the
> > > > > CPU. For that we'll need the host SMMU driver to hook into the KVM VMID
> > > > > allocator, just like we do for the ASID allocator. I haven't yet
> > > > > investigated how to do that. It's possible to do vSVA without DVM
> > > > > though, by sending all TLB invalidations through the SMMU command queue.
> > > > > "
> > >
> > > Hm we're already mandating DVM for host SVA, so I'd say mandate it for
> > > vSVA as well. We'd avoid a ton of context switches, especially for the zip
> > > accelerator which doesn't require ATC invalidations. The host needs to pin
> > > the VMID allocated by KVM and write it in the endpoint's STE.
> > >
> >
> > Curious... what is DVM and how is it related to SVA? Is it SMMU specific?
> 
> Yes it stands for "Distributed Virtual Memory", an Arm interconnect
> protocol. When sharing a process address space, TLB invalidations from the
> CPU are broadcasted to the SMMU, so we don't have to send commands through
> the SMMU queue to invalidate IOTLBs. However ATCs from PCIe endpoints do
> not participate in DVM and still have to be invalidated by hand.
> 

ah, got it. Thanks for explanation!

Re: [PATCH v16 00/10] VIRTIO-IOMMU device
Posted by Peter Maydell 4 years, 1 month ago
On Fri, 14 Feb 2020 at 13:28, Eric Auger <eric.auger@redhat.com> wrote:
>
> This series implements the QEMU virtio-iommu device.
>
> This matches the v0.12 spec (voted) and the corresponding
> virtio-iommu driver upstreamed in 5.3. All kernel dependencies
> are resolved for DT integration. The virtio-iommu can be
> instantiated in ARM virt using:
>
> "-device virtio-iommu-pci".
>
> Non DT mode is not yet supported as it has non resolved kernel
> dependencies [1].
>
> This feature targets 5.0.
>
> Integration with vhost devices and vfio devices is not part
> of this series. Please follow Bharat's respins [2].

I think everything here has reviewed-by tags now -- does
anybody still want more time to review it, and what
is the preference for how it goes into master?

thanks
-- PMM

Re: [PATCH v16 00/10] VIRTIO-IOMMU device
Posted by Michael S. Tsirkin 4 years, 1 month ago
On Fri, Feb 21, 2020 at 02:27:30PM +0000, Peter Maydell wrote:
> On Fri, 14 Feb 2020 at 13:28, Eric Auger <eric.auger@redhat.com> wrote:
> >
> > This series implements the QEMU virtio-iommu device.
> >
> > This matches the v0.12 spec (voted) and the corresponding
> > virtio-iommu driver upstreamed in 5.3. All kernel dependencies
> > are resolved for DT integration. The virtio-iommu can be
> > instantiated in ARM virt using:
> >
> > "-device virtio-iommu-pci".
> >
> > Non DT mode is not yet supported as it has non resolved kernel
> > dependencies [1].
> >
> > This feature targets 5.0.
> >
> > Integration with vhost devices and vfio devices is not part
> > of this series. Please follow Bharat's respins [2].
> 
> I think everything here has reviewed-by tags now -- does
> anybody still want more time to review it, and what
> is the preference for how it goes into master?
> 
> thanks
> -- PMM

I guess I'll pick it up, most code seems to be virtio related.

Thanks everyone!

-- 
MST