We should handle VTD_FR_CONTEXT_ENTRY_P properly when synchronizing
shadow page tables. Having invalid context entry there is perfectly
valid when we move a device out of an existing domain. When that
happens, instead of posting an error we invalidate the whole region.
Without this patch, QEMU will crash if we do these steps:
(1) start QEMU with VT-d IOMMU and two 10G NICs (ixgbe)
(2) bind the NICs with vfio-pci in the guest
(3) start testpmd with the NICs applied
(4) stop testpmd
(5) rebind the NIC back to ixgbe kernel driver
The patch should fix it.
Reported-by: Pei Zhang <pezhang@redhat.com>
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1627272
Signed-off-by: Peter Xu <peterx@redhat.com>
---
hw/i386/intel_iommu.c | 20 +++++++++++++++++++-
1 file changed, 19 insertions(+), 1 deletion(-)
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index d6b4f8705d..b0884e87e8 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -37,6 +37,8 @@
#include "kvm_i386.h"
#include "trace.h"
+static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n);
+
static void vtd_define_quad(IntelIOMMUState *s, hwaddr addr, uint64_t val,
uint64_t wmask, uint64_t w1cmask)
{
@@ -1056,12 +1058,28 @@ static int vtd_sync_shadow_page_table(VTDAddressSpace *vtd_as)
{
int ret;
VTDContextEntry ce;
+ IOMMUNotifier *n;
ret = vtd_dev_to_context_entry(vtd_as->iommu_state,
pci_bus_num(vtd_as->bus),
vtd_as->devfn, &ce);
if (ret) {
- return ret;
+ if (ret == -VTD_FR_CONTEXT_ENTRY_P) {
+ /*
+ * It's a valid scenario to have a context entry that is
+ * not present. For example, when a device is removed
+ * from an existing domain then the context entry will be
+ * zeroed by the guest before it was put into another
+ * domain. When this happens, instead of synchronizing
+ * the shadow pages we should invalidate all existing
+ * mappings and notify the backends.
+ */
+ IOMMU_NOTIFIER_FOREACH(n, &vtd_as->iommu) {
+ vtd_address_space_unmap(vtd_as, n);
+ }
+ } else {
+ return ret;
+ }
}
return vtd_sync_shadow_page_table_range(vtd_as, &ce, 0, UINT64_MAX);
--
2.17.1
Hi Peter,
On 10/8/18 8:47 AM, Peter Xu wrote:
> We should handle VTD_FR_CONTEXT_ENTRY_P properly when synchronizing
> shadow page tables. Having invalid context entry there is perfectly
> valid when we move a device out of an existing domain. When that
> happens, instead of posting an error we invalidate the whole region.
>
> Without this patch, QEMU will crash if we do these steps:
>
> (1) start QEMU with VT-d IOMMU and two 10G NICs (ixgbe)
> (2) bind the NICs with vfio-pci in the guest
> (3) start testpmd with the NICs applied
> (4) stop testpmd
> (5) rebind the NIC back to ixgbe kernel driver
>
> The patch should fix it.
>
> Reported-by: Pei Zhang <pezhang@redhat.com>
> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1627272
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
> hw/i386/intel_iommu.c | 20 +++++++++++++++++++-
> 1 file changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index d6b4f8705d..b0884e87e8 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -37,6 +37,8 @@
> #include "kvm_i386.h"
> #include "trace.h"
>
> +static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n);
> +
> static void vtd_define_quad(IntelIOMMUState *s, hwaddr addr, uint64_t val,
> uint64_t wmask, uint64_t w1cmask)
> {
> @@ -1056,12 +1058,28 @@ static int vtd_sync_shadow_page_table(VTDAddressSpace *vtd_as)
> {
> int ret;
> VTDContextEntry ce;
> + IOMMUNotifier *n;
>
> ret = vtd_dev_to_context_entry(vtd_as->iommu_state,
> pci_bus_num(vtd_as->bus),
> vtd_as->devfn, &ce);
> if (ret) {
> - return ret;
> + if (ret == -VTD_FR_CONTEXT_ENTRY_P) {
> + /*
> + * It's a valid scenario to have a context entry that is
> + * not present. For example, when a device is removed
> + * from an existing domain then the context entry will be
> + * zeroed by the guest before it was put into another
> + * domain. When this happens, instead of synchronizing
> + * the shadow pages we should invalidate all existing
> + * mappings and notify the backends.
> + */
> + IOMMU_NOTIFIER_FOREACH(n, &vtd_as->iommu) {
> + vtd_address_space_unmap(vtd_as, n);
> + }
don't you want to return here also in this case?
Thanks
Eric
> + } else {
> + return ret;
> + }
> }
>
> return vtd_sync_shadow_page_table_range(vtd_as, &ce, 0, UINT64_MAX);
>
On Tue, Oct 09, 2018 at 09:27:15AM +0200, Auger Eric wrote:
[...]
> > @@ -1056,12 +1058,28 @@ static int vtd_sync_shadow_page_table(VTDAddressSpace *vtd_as)
> > {
> > int ret;
> > VTDContextEntry ce;
> > + IOMMUNotifier *n;
> >
> > ret = vtd_dev_to_context_entry(vtd_as->iommu_state,
> > pci_bus_num(vtd_as->bus),
> > vtd_as->devfn, &ce);
> > if (ret) {
> > - return ret;
> > + if (ret == -VTD_FR_CONTEXT_ENTRY_P) {
> > + /*
> > + * It's a valid scenario to have a context entry that is
> > + * not present. For example, when a device is removed
> > + * from an existing domain then the context entry will be
> > + * zeroed by the guest before it was put into another
> > + * domain. When this happens, instead of synchronizing
> > + * the shadow pages we should invalidate all existing
> > + * mappings and notify the backends.
> > + */
> > + IOMMU_NOTIFIER_FOREACH(n, &vtd_as->iommu) {
> > + vtd_address_space_unmap(vtd_as, n);
> > + }
> don't you want to return here also in this case?
Yes... a stupid mistake!
I'll repost. Thank you,
--
Peter Xu
© 2016 - 2025 Red Hat, Inc.