Intel VT-d rev3.0 [1] introduces a new translation mode called
'scalable mode', which enables PASID-granular translations for
first level, second level, nested and pass-through modes. VT-d
scalable mode is the key ingredient to enable Scalable I/O
Virtualization (Scalable IOV) [2] [3], which allows sharing a
device at the finest possible granularity (ADI - Assignable Device
Interface). As a result, the previous Extended Context (ECS) mode
is deprecated (no production hardware ever implemented ECS).
This patch set emulates a minimal capability set of VT-d scalable
mode, equivalent to what is available in VT-d legacy mode today:
1. Scalable mode root entry, context entry and PASID table
2. Second level translation under scalable mode
3. Queued invalidation (with 256-bit descriptors)
4. Pass-through mode
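As a rough illustration of item 1, the sketch below shows how a context
entry address could be derived from a root entry in each mode. It is not
taken from the patches: the type, macro and function names are
hypothetical, and the pointer mask is simplified to plain 4 KiB alignment.
It assumes the layout described in the VT-d spec, where legacy context
entries are 128 bits wide and scalable-mode context entries are 256 bits
wide, with the upper half of the root entry covering devfn 128-255 in
scalable mode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for illustration; not the identifiers used by the patches. */
#define CTX_ENTRY_LEGACY_SIZE    16u          /* 128-bit legacy context entry */
#define CTX_ENTRY_SCALABLE_SIZE  32u          /* 256-bit scalable-mode context entry */
#define TABLE_PTR_MASK           (~0xfffULL)  /* simplified: 4 KiB aligned pointer */
#define DEVFN_UPPER_HALF         0x80u        /* devfn 128-255 use the upper table */

typedef struct {
    uint64_t lo;   /* legacy: context table; scalable: lower context table */
    uint64_t hi;   /* scalable only: upper context table */
} RootEntry;

/* Return the guest-physical address of the context entry for devfn. */
static uint64_t context_entry_addr(const RootEntry *re, uint8_t devfn,
                                   bool scalable)
{
    assert(re != NULL);

    uint64_t base = re->lo & TABLE_PTR_MASK;
    uint32_t index = devfn;
    uint32_t size = CTX_ENTRY_LEGACY_SIZE;

    if (scalable) {
        size = CTX_ENTRY_SCALABLE_SIZE;
        if (devfn & DEVFN_UPPER_HALF) {
            /* Upper half of the root entry covers devfn 128-255. */
            base = re->hi & TABLE_PTR_MASK;
            index = devfn & ~DEVFN_UPPER_HALF;
        }
    }
    return base + (uint64_t)index * size;
}
```

For example, with a lower table at 0x1000 and an upper table at 0x2000,
devfn 0x88 resolves to 0x1880 in legacy mode and to 0x2100 in scalable
mode.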
Corresponding intel-iommu driver support will be included in
kernel 5.0:
https://www.spinics.net/lists/kernel/msg2985279.html
We will add emulation of full scalable mode capability along with
guest iommu driver progress later, e.g.:
1. First level translation
2. Nested translation
3. Per-PASID invalidation descriptors
4. Page request services for handling recoverable faults
To verify the patches, the following cases were tested, as suggested by
Peter Xu.
+----------+----------------+----------------+---------+------------+---------+
| Mode     | Device passthr | virtio-net-pci | netperf | kernel bld | data cp |
+----------+----------------+----------------+---------+------------+---------+
| Legacy   | with           | vhost=on       | Pass    | Pass       | Pass    |
| Legacy   | with           | vhost=off      | Pass    | Pass       | Pass    |
| Legacy   | without        | vhost=on       | Pass    | Pass       | Pass    |
| Legacy   | without        | vhost=off      | Pass    | Pass       | Pass    |
+----------+----------------+----------------+---------+------------+---------+
| Scalable | with           | vhost=on       | Pass    | Pass       | Pass    |
| Scalable | with           | vhost=off      | Pass    | Pass       | Pass    |
| Scalable | without        | vhost=on       | Pass    | Pass       | Pass    |
| Scalable | without        | vhost=off      | Pass    | Pass       | Pass    |
+----------+----------------+----------------+---------+------------+---------+
References:
[1] https://software.intel.com/en-us/download/intel-virtualization-technology-for-directed-io-architecture-specification
[2] https://software.intel.com/en-us/download/intel-scalable-io-virtualization-technical-specification
[3] https://schd.ws/hosted_files/lc32018/00/LC3-SIOV-final.pdf
---
v1->v2
Patch 1:
- remove unnecessary macros.
- rename macros to capital.
- make 're->hi' assignment unconditional to simplify the code.
- remove 'vtd_get_context_base' to embed its content into caller.
- remove 'vtd_context_entry_format' to embed its content into caller.
- remove unnecessary memset for 'pe->val'.
- use 'INTEL_IOMMU_DEVICE' to get 'IntelIOMMUState' to remove input
'IntelIOMMUState *s' parameter.
- call 'vtd_get_domain_id' to get domain_id.
- check error code returned by 'vtd_ce_get_rid2pasid_entry' in
'vtd_dev_pt_enabled'.
- check '!is_fpd_set' of context entry before handling pasid entry.
- move 's->root_scalable' assignment to patch 3.
- add comment for 'VTD_FR_PASID_TABLE_INV'.
- remove unused 'VTD_ROOT_ENTRY_SIZE'.
- change 'VTD_CTX_ENTRY_LECY_SIZE' to 'VTD_CTX_ENTRY_LEGACY_SIZE'.
- change 'VTD_CTX_ENTRY_SM_SIZE' to 'VTD_CTX_ENTRY_SCALABLE_SIZE'.
- use union in 'struct VTDContextEntry' to reduce code changes.
Patch 2:
- modify s-o-b position.
- remove unnecessary macros.
- change 'iq_dw' type to bool.
- remove initialization to 'inv_desc->val[]'.
- modify 'VTDInvDesc' to add a union 'val[4]' to be compatible
with both legacy mode and scalable mode.
Patch 3:
- rename "scalable-mode" to "x-scalable-mode".
- remove caching_mode check when scalable_mode is set.
- add dma_drain check when scalable_mode is set. This is required
by the spec.
- remove redundant macros.
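
To make the Patch 2 notes above more concrete, here is a minimal sketch
of a descriptor type that exposes both a legacy lo/hi view and a
four-qword view, plus the queue addressing that follows from a boolean
'iq_dw' width flag. The type and helper names are hypothetical and the
layout is only an assumption about the general shape; the actual
'VTDInvDesc' in the patches may differ in detail. The sizes assume
16-byte legacy descriptors and 32-byte scalable-mode descriptors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative layout only; not copied from the patches. */
typedef union {
    struct {
        uint64_t lo;   /* legacy 128-bit view: qword 0 */
        uint64_t hi;   /* legacy 128-bit view: qword 1 */
    };
    uint64_t val[4];   /* 256-bit view: qwords 0..3 */
} InvDesc;

_Static_assert(sizeof(InvDesc) == 32, "descriptor must hold 256 bits");

/* Descriptor size in bytes for the selected queue descriptor width. */
static size_t inv_desc_size(bool iq_dw)
{
    return iq_dw ? 32 : 16;
}

/* Address of the descriptor at index 'head' in the invalidation queue. */
static uint64_t inv_desc_addr(uint64_t iq_base, uint16_t head, bool iq_dw)
{
    return iq_base + (uint64_t)head * inv_desc_size(iq_dw);
}

/* Hypothetical helper: true if only the legacy qwords carry data. */
static bool inv_desc_upper_is_zero(const InvDesc *d)
{
    assert(d != NULL);
    return d->val[2] == 0 && d->val[3] == 0;
}
```

Because 'lo' and 'hi' alias 'val[0]' and 'val[1]', code written against
the legacy fields keeps working when the queue switches to 256-bit
descriptors; only the stride used to step through the queue changes.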
---
Liu, Yi L (2):
intel_iommu: scalable mode emulation
intel_iommu: add 256 bits qi_desc support
Yi Sun (1):
intel_iommu: add scalable-mode option to make scalable mode work
hw/i386/intel_iommu.c | 540 ++++++++++++++++++++++++++++++++++-------
hw/i386/intel_iommu_internal.h | 54 ++++-
hw/i386/trace-events | 2 +-
include/hw/i386/intel_iommu.h | 28 ++-
4 files changed, 534 insertions(+), 90 deletions(-)
--
1.9.1

On Thu, Feb 28, 2019 at 09:47:54PM +0800, Yi Sun wrote:
> Intel vt-d rev3.0 [1] introduces a new translation mode called
> 'scalable mode', which enables PASID-granular translations for
> first level, second level, nested and pass-through modes.

[...]

> To verify the patches, below cases were tested according to Peter Xu's
> suggestions.

[...]

Hi, Yi,

Thanks very much for the thorough test matrix!

The last thing I'd like to confirm is have you tested device
assignment with v2? And note that when you test with virtio devices
you should not need caching-mode=on (but caching-mode=on should not
break anyone though).

I've still got some comments here and there but it looks very good at
least to me overall.

Thanks,

--
Peter Xu

On 19-03-01 15:07:34, Peter Xu wrote:
> On Thu, Feb 28, 2019 at 09:47:54PM +0800, Yi Sun wrote:

[...]

> Hi, Yi,
>
> Thanks very much for the thorough test matrix!
>
Thanks for the review and comments! :)

> The last thing I'd like to confirm is have you tested device
> assignment with v2? And note that when you test with virtio devices

Yes, I tested a MDEV assignment which can walk the Scalable Mode
patches flows (both kernel and qemu).

> you should not need caching-mode=on (but caching-mode=on should not
> break anyone though).
>
For virtio-net-pci without device assignment, I did not use
"caching-mode=on".

> I've still got some comments here and there but it looks very good at
> least to me overall.
>
> Thanks,
>
> --
> Peter Xu

> From: Yi Sun [mailto:yi.y.sun@linux.intel.com]
> Sent: Friday, March 1, 2019 3:13 PM
>
> On 19-03-01 15:07:34, Peter Xu wrote:
> > On Thu, Feb 28, 2019 at 09:47:54PM +0800, Yi Sun wrote:

[...]

> > The last thing I'd like to confirm is have you tested device
> > assignment with v2? And note that when you test with virtio devices
>
> Yes, I tested a MDEV assignment which can walk the Scalable Mode
> patches flows (both kernel and qemu).

Not just MDEV. You should also try a physical PCI endpoint device.

> > you should not need caching-mode=on (but caching-mode=on should not
> > break anyone though).
> >
> For virtio-net-pci without device assignment, I did not use
> "caching-mode=on".
>
> > I've still got some comments here and there but it looks very good at
> > least to me overall.
> >
> > Thanks,
> >
> > --
> > Peter Xu