[RFC PATCH] VT-d: Don't assume register-based invalidation is always supported

Chao Gao posted 1 patch 3 years, 7 months ago
Test gitlab-ci passed
Patches applied successfully (tree, apply log)
git fetch https://gitlab.com/xen-project/patchew/xen tags/patchew/20210414005526.36760-1-chao.gao@intel.com
docs/misc/xen-command-line.pandoc    |  4 ++-
xen/drivers/passthrough/vtd/iommu.c  | 40 +++++++++++++++++++++++++---
xen/drivers/passthrough/vtd/iommu.h  |  7 +++++
xen/drivers/passthrough/vtd/qinval.c | 33 +++++++++++++++++++++--
4 files changed, 77 insertions(+), 7 deletions(-)
[RFC PATCH] VT-d: Don't assume register-based invalidation is always supported
Posted by Chao Gao 3 years, 7 months ago
According to Intel VT-d SPEC rev3.3 Section 6.5, Register-based Invalidation
isn't supported by Intel VT-d version 6 and beyond.

This hardware change impacts following two scenarios: admin can disable
queued invalidation via 'qinval' cmdline and use register-based interface;
VT-d switches to register-based invalidation when queued invalidation needs
to be disabled, for example, during disabling x2apic or during system
suspension.

To deal with this hardware change, if register-based invalidation isn't
supported, queued invalidation cannot be disabled through Xen cmdline; and
if queued invalidation has to be disabled temporarily in some scenarios,
VT-d won't switch to register-based interface but use some dummy functions
to catch errors in case there is any invalidation request issued when queued
invalidation is disabled.

Signed-off-by: Chao Gao <chao.gao@intel.com>
---
I only tested Xen boot with qinval/no-qinval. I also want to do system
suspension and resumption to see if any unexpected error. But I don't
know how to trigger them. Any recommendation?
---
 docs/misc/xen-command-line.pandoc    |  4 ++-
 xen/drivers/passthrough/vtd/iommu.c  | 40 +++++++++++++++++++++++++---
 xen/drivers/passthrough/vtd/iommu.h  |  7 +++++
 xen/drivers/passthrough/vtd/qinval.c | 33 +++++++++++++++++++++--
 4 files changed, 77 insertions(+), 7 deletions(-)

diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
index deef6d0b4c..4ff4a87844 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -1442,7 +1442,9 @@ The following options are specific to Intel VT-d hardware:
 *   The `qinval` boolean controls the Queued Invalidation sub-feature, and is
     active by default on compatible hardware.  Queued Invalidation is a
     feature in second-generation IOMMUs and is a functional prerequisite for
-    Interrupt Remapping.
+    Interrupt Remapping. Note that Xen disregards this setting for Intel VT-d
+    version 6 and greater as Registered-Based Invalidation isn't supported
+    by them.
 
 *   The `igfx` boolean is active by default, and controls whether the IOMMU in
     front of an Intel Graphics Device is enabled or not.
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 6428c8fe3e..e738d04543 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -1193,6 +1193,14 @@ int __init iommu_alloc(struct acpi_drhd_unit *drhd)
 
     iommu->cap = dmar_readq(iommu->reg, DMAR_CAP_REG);
     iommu->ecap = dmar_readq(iommu->reg, DMAR_ECAP_REG);
+    iommu->version = dmar_readl(iommu->reg, DMAR_VER_REG);
+
+    if ( !iommu_qinval && !has_register_based_invalidation(iommu) )
+    {
+        printk(XENLOG_WARNING VTDPREFIX "IOMMU %d: cannot disable Queued Invalidation.\n",
+               iommu->index);
+        iommu_qinval = true;
+    }
 
     if ( iommu_verbose )
     {
@@ -2231,6 +2239,8 @@ static int __init vtd_setup(void)
     struct acpi_drhd_unit *drhd;
     struct vtd_iommu *iommu;
     int ret;
+    bool queued_inval_supported = true;
+    bool reg_inval_supported = true;
 
     if ( list_empty(&acpi_drhd_units) )
     {
@@ -2252,8 +2262,8 @@ static int __init vtd_setup(void)
     }
 
     /* We enable the following features only if they are supported by all VT-d
-     * engines: Snoop Control, DMA passthrough, Queued Invalidation, Interrupt
-     * Remapping, and Posted Interrupt
+     * engines: Snoop Control, DMA passthrough, Register-based Invalidation,
+     * Queued Invalidation, Interrupt Remapping, and Posted Interrupt.
      */
     for_each_drhd_unit ( drhd )
     {
@@ -2272,8 +2282,11 @@ static int __init vtd_setup(void)
         if ( iommu_hwdom_passthrough && !ecap_pass_thru(iommu->ecap) )
             iommu_hwdom_passthrough = false;
 
-        if ( iommu_qinval && !ecap_queued_inval(iommu->ecap) )
-            iommu_qinval = 0;
+        if ( reg_inval_supported && !has_register_based_invalidation(iommu) )
+            reg_inval_supported = false;
+
+        if ( queued_inval_supported && !ecap_queued_inval(iommu->ecap) )
+            queued_inval_supported = false;
 
         if ( iommu_intremap && !ecap_intr_remap(iommu->ecap) )
             iommu_intremap = iommu_intremap_off;
@@ -2301,6 +2314,25 @@ static int __init vtd_setup(void)
 
     softirq_tasklet_init(&vtd_fault_tasklet, do_iommu_page_fault, NULL);
 
+    if ( !queued_inval_supported && !reg_inval_supported )
+    {
+        dprintk(XENLOG_ERR VTDPREFIX, "No available invalidation interface.\n");
+        ret = -ENODEV;
+        goto error;
+    }
+
+    /*
+     * We cannot have !iommu_qinval && !reg_inval_supported here since
+     * iommu_qinval is set in iommu_alloc() if any iommu doesn't support
+     * Register-based Invalidation.
+     */
+    if ( iommu_qinval && !queued_inval_supported )
+    {
+        dprintk(XENLOG_WARNING VTDPREFIX, "Disable Queued Invalidation as "
+                "it isn't supported.\n");
+        iommu_qinval = false;
+    }
+
     if ( !iommu_qinval && iommu_intremap )
     {
         iommu_intremap = iommu_intremap_off;
diff --git a/xen/drivers/passthrough/vtd/iommu.h b/xen/drivers/passthrough/vtd/iommu.h
index 216791b3d6..644224051a 100644
--- a/xen/drivers/passthrough/vtd/iommu.h
+++ b/xen/drivers/passthrough/vtd/iommu.h
@@ -540,6 +540,7 @@ struct vtd_iommu {
     struct list_head ats_devices;
     unsigned long *domid_bitmap;  /* domain id bitmap */
     u16 *domid_map;               /* domain id mapping array */
+    u32 version;
 };
 
 #define INTEL_IOMMU_DEBUG(fmt, args...) \
@@ -549,4 +550,10 @@ struct vtd_iommu {
             dprintk(XENLOG_WARNING VTDPREFIX, fmt, ## args);    \
     } while(0)
 
+/* Register-based invalidation isn't supported by VT-d version 6 and beyond. */
+static inline bool has_register_based_invalidation(struct vtd_iommu *iommu)
+{
+    return VER_MAJOR(iommu->version) < 6;
+}
+
 #endif
diff --git a/xen/drivers/passthrough/vtd/qinval.c b/xen/drivers/passthrough/vtd/qinval.c
index 764ef9f0fc..1e90f50ff8 100644
--- a/xen/drivers/passthrough/vtd/qinval.c
+++ b/xen/drivers/passthrough/vtd/qinval.c
@@ -442,6 +442,23 @@ int enable_qinval(struct vtd_iommu *iommu)
     return 0;
 }
 
+static int vtd_flush_context_noop(struct vtd_iommu *iommu, uint16_t did,
+                                  uint16_t source_id, uint8_t function_mask,
+                                  uint64_t type, bool flush_non_present_entry)
+{
+    dprintk(XENLOG_ERR VTDPREFIX, "IOMMU: Cannot flush CONTEXT.\n");
+    return -EIO;
+}
+
+static int vtd_flush_iotlb_noop(struct vtd_iommu *iommu, uint16_t did,
+                                uint64_t addr, unsigned int size_order,
+                                uint64_t type, bool flush_non_present_entry,
+                                bool flush_dev_iotlb)
+{
+    dprintk(XENLOG_ERR VTDPREFIX, "IOMMU: Cannot flush IOTLB.\n");
+    return -EIO;
+}
+
 void disable_qinval(struct vtd_iommu *iommu)
 {
     u32 sts;
@@ -463,6 +480,18 @@ void disable_qinval(struct vtd_iommu *iommu)
 out:
     spin_unlock_irqrestore(&iommu->register_lock, flags);
 
-    iommu->flush.context = vtd_flush_context_reg;
-    iommu->flush.iotlb   = vtd_flush_iotlb_reg;
+    /*
+     * Assign callbacks to noop to catch errors if register-based invalidation
+     * isn't supported.
+     */
+    if ( has_register_based_invalidation(iommu) )
+    {
+        iommu->flush.context = vtd_flush_context_reg;
+        iommu->flush.iotlb   = vtd_flush_iotlb_reg;
+    }
+    else
+    {
+        iommu->flush.context = vtd_flush_context_noop;
+        iommu->flush.iotlb = vtd_flush_iotlb_noop;
+    }
 }
-- 
2.25.1


Re: [RFC PATCH] VT-d: Don't assume register-based invalidation is always supported
Posted by Jan Beulich 3 years, 7 months ago
On 14.04.2021 02:55, Chao Gao wrote:
> According to Intel VT-d SPEC rev3.3 Section 6.5, Register-based Invalidation
> isn't supported by Intel VT-d version 6 and beyond.
> 
> This hardware change impacts following two scenarios: admin can disable
> queued invalidation via 'qinval' cmdline and use register-based interface;
> VT-d switches to register-based invalidation when queued invalidation needs
> to be disabled, for example, during disabling x2apic or during system
> suspension.
> 
> To deal with this hardware change, if register-based invalidation isn't
> supported, queued invalidation cannot be disabled through Xen cmdline; and
> if queued invalidation has to be disabled temporarily in some scenarios,
> VT-d won't switch to register-based interface but use some dummy functions
> to catch errors in case there is any invalidation request issued when queued
> invalidation is disabled.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> ---
> I only tested Xen boot with qinval/no-qinval. I also want to do system
> suspension and resumption to see if any unexpected error. But I don't
> know how to trigger them. Any recommendation?

Iirc, if your distro doesn't support a proper interface for this, it's
as simple as "echo mem >/sys/power/state".

> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -1193,6 +1193,14 @@ int __init iommu_alloc(struct acpi_drhd_unit *drhd)
>  
>      iommu->cap = dmar_readq(iommu->reg, DMAR_CAP_REG);
>      iommu->ecap = dmar_readq(iommu->reg, DMAR_ECAP_REG);
> +    iommu->version = dmar_readl(iommu->reg, DMAR_VER_REG);
> +
> +    if ( !iommu_qinval && !has_register_based_invalidation(iommu) )
> +    {
> +        printk(XENLOG_WARNING VTDPREFIX "IOMMU %d: cannot disable Queued Invalidation.\n",
> +               iommu->index);
> +        iommu_qinval = true;
> +    }

With this I don't see ...

> @@ -2231,6 +2239,8 @@ static int __init vtd_setup(void)
>      struct acpi_drhd_unit *drhd;
>      struct vtd_iommu *iommu;
>      int ret;
> +    bool queued_inval_supported = true;
> +    bool reg_inval_supported = true;

... the need for the first variable here. You'd simply ...

> @@ -2272,8 +2282,11 @@ static int __init vtd_setup(void)
>          if ( iommu_hwdom_passthrough && !ecap_pass_thru(iommu->ecap) )
>              iommu_hwdom_passthrough = false;
>  
> -        if ( iommu_qinval && !ecap_queued_inval(iommu->ecap) )
> -            iommu_qinval = 0;

... clear iommu_qinval here again, and use that in the 1st if() you
add in the next hunk; the 2nd if() there would go away again.

> +        if ( reg_inval_supported && !has_register_based_invalidation(iommu) )
> +            reg_inval_supported = false;
> +
> +        if ( queued_inval_supported && !ecap_queued_inval(iommu->ecap) )
> +            queued_inval_supported = false;

I don't see the need for the left sides of the && in both of these
(or in fact any of the pre-existing) if()-s. (Of course this is not
a request to also adjust the ones that are already there.)

> @@ -2301,6 +2314,25 @@ static int __init vtd_setup(void)
>  
>      softirq_tasklet_init(&vtd_fault_tasklet, do_iommu_page_fault, NULL);
>  
> +    if ( !queued_inval_supported && !reg_inval_supported )
> +    {
> +        dprintk(XENLOG_ERR VTDPREFIX, "No available invalidation interface.\n");
> +        ret = -ENODEV;
> +        goto error;
> +    }
> +
> +    /*
> +     * We cannot have !iommu_qinval && !reg_inval_supported here since
> +     * iommu_qinval is set in iommu_alloc() if any iommu doesn't support
> +     * Register-based Invalidation.
> +     */
> +    if ( iommu_qinval && !queued_inval_supported )
> +    {
> +        dprintk(XENLOG_WARNING VTDPREFIX, "Disable Queued Invalidation as "
> +                "it isn't supported.\n");
> +        iommu_qinval = false;
> +    }
> +
>      if ( !iommu_qinval && iommu_intremap )
>      {
>          iommu_intremap = iommu_intremap_off;
> --- a/xen/drivers/passthrough/vtd/iommu.h
> +++ b/xen/drivers/passthrough/vtd/iommu.h
> @@ -540,6 +540,7 @@ struct vtd_iommu {
>      struct list_head ats_devices;
>      unsigned long *domid_bitmap;  /* domain id bitmap */
>      u16 *domid_map;               /* domain id mapping array */
> +    u32 version;

Nit: uint32_t please in new code, assuming a fixed-width value is
needed here in the first place (see ./CODING_STYLE).

> @@ -549,4 +550,10 @@ struct vtd_iommu {
>              dprintk(XENLOG_WARNING VTDPREFIX, fmt, ## args);    \
>      } while(0)
>  
> +/* Register-based invalidation isn't supported by VT-d version 6 and beyond. */
> +static inline bool has_register_based_invalidation(struct vtd_iommu *iommu)

"const" please

> @@ -463,6 +480,18 @@ void disable_qinval(struct vtd_iommu *iommu)
>  out:
>      spin_unlock_irqrestore(&iommu->register_lock, flags);
>  
> -    iommu->flush.context = vtd_flush_context_reg;
> -    iommu->flush.iotlb   = vtd_flush_iotlb_reg;
> +    /*
> +     * Assign callbacks to noop to catch errors if register-based invalidation
> +     * isn't supported.
> +     */
> +    if ( has_register_based_invalidation(iommu) )
> +    {
> +        iommu->flush.context = vtd_flush_context_reg;
> +        iommu->flush.iotlb   = vtd_flush_iotlb_reg;
> +    }
> +    else
> +    {
> +        iommu->flush.context = vtd_flush_context_noop;
> +        iommu->flush.iotlb = vtd_flush_iotlb_noop;

Nit: Would be nice if aligning (or not) the = operators was done
the same in both cases.

Seeing this part of the change I wonder whether you shouldn't also
alter the other place where the register-based invalidation hooks
get put in place: It can't be right to install them when enabling
qinval fails but register-based invalidation is also not available.
I guess, no matter how much we'd like to avoid such, panic() may be
needed there in this case, or IOMMU initialization as a whole needs
to be failed.

Jan

Re: [RFC PATCH] VT-d: Don't assume register-based invalidation is always supported
Posted by Chao Gao 3 years, 7 months ago
On Wed, Apr 14, 2021 at 12:07:02PM +0200, Jan Beulich wrote:
>On 14.04.2021 02:55, Chao Gao wrote:
>> According to Intel VT-d SPEC rev3.3 Section 6.5, Register-based Invalidation
>> isn't supported by Intel VT-d version 6 and beyond.
>> 
>> This hardware change impacts following two scenarios: admin can disable
>> queued invalidation via 'qinval' cmdline and use register-based interface;
>> VT-d switches to register-based invalidation when queued invalidation needs
>> to be disabled, for example, during disabling x2apic or during system
>> suspension.
>> 
>> To deal with this hardware change, if register-based invalidation isn't
>> supported, queued invalidation cannot be disabled through Xen cmdline; and
>> if queued invalidation has to be disabled temporarily in some scenarios,
>> VT-d won't switch to register-based interface but use some dummy functions
>> to catch errors in case there is any invalidation request issued when queued
>> invalidation is disabled.
>> 
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> ---
>> I only tested Xen boot with qinval/no-qinval. I also want to do system
>> suspension and resumption to see if any unexpected error. But I don't
>> know how to trigger them. Any recommendation?
>
>Iirc, if your distro doesn't support a proper interface for this, it's
>as simple as "echo mem >/sys/power/state".

Thanks. I will give it a try. And all your comments make a lot of sense.
Will fix all of them in the next version.

Chao