[Qemu-devel] [PATCH] pcie-root-port: let it has higher migrate priority

Peter Xu posted 1 patch 6 years, 2 months ago
Patches applied successfully
git fetch https://github.com/patchew-project/qemu tags/patchew/20180201112028.23552-1-peterx@redhat.com
Test checkpatch passed
Test docker-build@min-glib passed
Test docker-mingw@fedora passed
Test docker-quick@centos6 passed
Test ppc passed
Test s390x passed
[Qemu-devel] [PATCH] pcie-root-port: let it has higher migrate priority
Posted by Peter Xu 6 years, 2 months ago
In the past, we prioritized IOMMU migration, which gives us this
priority order:

    IOMMU > PCI Devices

When migrating a guest with both vIOMMU and pcie-root-port, we'll always
migrate vIOMMU first, since pcie-root-port is treated as having the same
priority as general PCI devices.

That's problematic.

The thing is that PCI bus number information is stored in the root port,
and the vIOMMU needs it during post_load(), e.g., to figure out the
context entry for a device.  If we don't have correct bus numbers for
devices, we won't be able to recover the device state of the DMAR memory
regions, and things will be messed up.

So let's boost the PCIe root ports to an even higher priority:

   PCIe Root Port > IOMMU > PCI Devices

A smoke test shows that this patch fixes bug 1538953.

CC: Alex Williamson <alex.williamson@redhat.com>
CC: Marcel Apfelbaum <marcel@redhat.com>
CC: Michael S. Tsirkin <mst@redhat.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Laurent Vivier <lvivier@redhat.com>
Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1538953
Reported-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
Marcel & all,

I think it's possible that we need a similar thing for other bridge-like
devices, but I'm not that familiar with them.  Would you help confirm?  Thanks,
---
 hw/pci-bridge/gen_pcie_root_port.c | 1 +
 include/migration/vmstate.h        | 1 +
 2 files changed, 2 insertions(+)

diff --git a/hw/pci-bridge/gen_pcie_root_port.c b/hw/pci-bridge/gen_pcie_root_port.c
index 0e2f2e8bf1..e6ff1effd8 100644
--- a/hw/pci-bridge/gen_pcie_root_port.c
+++ b/hw/pci-bridge/gen_pcie_root_port.c
@@ -101,6 +101,7 @@ static void gen_rp_realize(DeviceState *dev, Error **errp)
 
 static const VMStateDescription vmstate_rp_dev = {
     .name = "pcie-root-port",
+    .priority = MIG_PRI_PCIE_ROOT_PORT,
     .version_id = 1,
     .minimum_version_id = 1,
     .post_load = pcie_cap_slot_post_load,
diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
index 8c3889433c..491449db9f 100644
--- a/include/migration/vmstate.h
+++ b/include/migration/vmstate.h
@@ -148,6 +148,7 @@ enum VMStateFlags {
 typedef enum {
     MIG_PRI_DEFAULT = 0,
     MIG_PRI_IOMMU,              /* Must happen before PCI devices */
+    MIG_PRI_PCIE_ROOT_PORT,     /* Must happen before IOMMU */
     MIG_PRI_GICV3_ITS,          /* Must happen before PCI devices */
     MIG_PRI_GICV3,              /* Must happen before the ITS */
     MIG_PRI_MAX,
-- 
2.14.3
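
The ordering this patch relies on can be sketched as below.  This is
only an illustration, not QEMU code: the enum is copied from the hunk
above, the typedef name MigrationPriority and the helper
migrates_before() are assumptions made for the example, and the rule
"a larger enum value is migrated earlier" is inferred from the
comments in the enum itself.

```c
#include <assert.h>
#include <stdbool.h>

/* The enum as it reads after this patch (include/migration/vmstate.h).
 * The typedef name is an assumption; the hunk does not show it. */
typedef enum {
    MIG_PRI_DEFAULT = 0,
    MIG_PRI_IOMMU,              /* Must happen before PCI devices */
    MIG_PRI_PCIE_ROOT_PORT,     /* Must happen before IOMMU */
    MIG_PRI_GICV3_ITS,          /* Must happen before PCI devices */
    MIG_PRI_GICV3,              /* Must happen before the ITS */
    MIG_PRI_MAX,
} MigrationPriority;

/* Hypothetical helper: true if a device with priority 'a' is guaranteed
 * to be migrated before a device with priority 'b'.  Equal priorities
 * give no guarantee from the priority field alone. */
static bool migrates_before(MigrationPriority a, MigrationPriority b)
{
    assert(a < MIG_PRI_MAX && b < MIG_PRI_MAX);
    return a > b;
}
```

With this placement, migrates_before(MIG_PRI_PCIE_ROOT_PORT,
MIG_PRI_IOMMU) holds, which is the property the commit message asks
for, and the existing IOMMU > PCI Devices relation is unchanged.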


Re: [Qemu-devel] [PATCH] pcie-root-port: let it has higher migrate priority
Posted by Marcel Apfelbaum 6 years, 2 months ago
Hi Peter,

On 01/02/2018 13:20, Peter Xu wrote:
> In the past, we prioritized IOMMU migration so that we have such a
> priority order:
> 
>     IOMMU > PCI Devices
> 
> When migrating a guest with both vIOMMU and pcie-root-port, we'll always
> migrate vIOMMU first, since pcie-root-port will be seen to have the same
> priority of general PCI devices.
> 
> That's problematic.
> 
> The thing is that PCI bus number information is stored in the root port,
> and that is needed by vIOMMU during post_load(), e.g., to figure out
> context entry for a device.  If we don't have correct bus numbers for
> devices, we won't be able to recover device state of the DMAR memory
> regions, and things will be messed up.
> 
> So let's boost the PCIe root ports to be even with higher priority:
> 
>    PCIe Root Port > IOMMU > PCI Devices
> 
> A smoke test shows that this patch fixes bug 1538953.
> 
> CC: Alex Williamson <alex.williamson@redhat.com>
> CC: Marcel Apfelbaum <marcel@redhat.com>
> CC: Michael S. Tsirkin <mst@redhat.com>
> CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
> CC: Juan Quintela <quintela@redhat.com>
> CC: Laurent Vivier <lvivier@redhat.com>
> Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1538953
> Reported-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
> Marcel & all,
> 
> I think it's possible that we need similar thing for other bridge-like
> devices, but I'm not that familiar.  Would you help confirm?  Thanks,

It's a pity we don't have a way to mark the migration priority
in a base class. Dave, maybe we do have a way?

In the meantime you would need to add it also to:
- ioh3420 (Intel root port)
- xio3130_downstream (Intel switch downstream port)
- xio3130_upstream (The counterpart of the above, you want the whole
  switch to be migrated before loading the IOMMU device state)
- pcie_pci_bridge (for pci devices)
- pci-pci bridge (if for some reason you have one attached to the pcie_pci_bridge)
- i82801b11 (dmi-pci bridge, we want to deprecate it but it is there for now)

Thanks,
Marcel

>  hw/pci-bridge/gen_pcie_root_port.c | 1 +
>  include/migration/vmstate.h        | 1 +
>  2 files changed, 2 insertions(+)
> 
> diff --git a/hw/pci-bridge/gen_pcie_root_port.c b/hw/pci-bridge/gen_pcie_root_port.c
> index 0e2f2e8bf1..e6ff1effd8 100644
> --- a/hw/pci-bridge/gen_pcie_root_port.c
> +++ b/hw/pci-bridge/gen_pcie_root_port.c
> @@ -101,6 +101,7 @@ static void gen_rp_realize(DeviceState *dev, Error **errp)
>  
>  static const VMStateDescription vmstate_rp_dev = {
>      .name = "pcie-root-port",
> +    .priority = MIG_PRI_PCIE_ROOT_PORT,
>      .version_id = 1,
>      .minimum_version_id = 1,
>      .post_load = pcie_cap_slot_post_load,
> diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
> index 8c3889433c..491449db9f 100644
> --- a/include/migration/vmstate.h
> +++ b/include/migration/vmstate.h
> @@ -148,6 +148,7 @@ enum VMStateFlags {
>  typedef enum {
>      MIG_PRI_DEFAULT = 0,
>      MIG_PRI_IOMMU,              /* Must happen before PCI devices */
> +    MIG_PRI_PCIE_ROOT_PORT,     /* Must happen before IOMMU */
>      MIG_PRI_GICV3_ITS,          /* Must happen before PCI devices */
>      MIG_PRI_GICV3,              /* Must happen before the ITS */
>      MIG_PRI_MAX,
> 


Re: [Qemu-devel] [PATCH] pcie-root-port: let it has higher migrate priority
Posted by Michael S. Tsirkin 6 years, 2 months ago
On Thu, Feb 01, 2018 at 02:24:15PM +0200, Marcel Apfelbaum wrote:
> Hi Peter,
> 
> On 01/02/2018 13:20, Peter Xu wrote:
> > In the past, we prioritized IOMMU migration so that we have such a
> > priority order:
> > 
> >     IOMMU > PCI Devices
> > 
> > When migrating a guest with both vIOMMU and pcie-root-port, we'll always
> > migrate vIOMMU first, since pcie-root-port will be seen to have the same
> > priority of general PCI devices.
> > 
> > That's problematic.
> > 
> > The thing is that PCI bus number information is stored in the root port,
> > and that is needed by vIOMMU during post_load(), e.g., to figure out
> > context entry for a device.  If we don't have correct bus numbers for
> > devices, we won't be able to recover device state of the DMAR memory
> > regions, and things will be messed up.
> > 
> > So let's boost the PCIe root ports to be even with higher priority:
> > 
> >    PCIe Root Port > IOMMU > PCI Devices
> > 
> > A smoke test shows that this patch fixes bug 1538953.
> > 
> > CC: Alex Williamson <alex.williamson@redhat.com>
> > CC: Marcel Apfelbaum <marcel@redhat.com>
> > CC: Michael S. Tsirkin <mst@redhat.com>
> > CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > CC: Juan Quintela <quintela@redhat.com>
> > CC: Laurent Vivier <lvivier@redhat.com>
> > Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1538953
> > Reported-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> > Marcel & all,
> > 
> > I think it's possible that we need similar thing for other bridge-like
> > devices, but I'm not that familiar.  Would you help confirm?  Thanks,
> 
> Is a pity we don't have a way to mark the migration priority
> in a base class. Dave, maybe we do have a way?
> 
> In the meantime you would need to add it also to:
> - ioh3420 (Intel root port)
> - xio3130_downstream (Intel switch downstream port)
> - xio3130_upstream (The counterpart of the above, you want the whole
>   switch to be migrated before loading the IOMMU device state)
> - pcie_pci_bridge (for pci devices)
> - pci-pci bridge (if for some reason you have one attached to the pcie_pci_brdge)
> - i82801b11 (dmi-pci bridge, we want to deprecate it bu is there for now)
> 
> Thanks,
> Marcel

It's kind of strange that we need to set the priority manually.
Can't migration figure it out itself? I think a bus
must always be migrated before the devices behind it ...

> >  hw/pci-bridge/gen_pcie_root_port.c | 1 +
> >  include/migration/vmstate.h        | 1 +
> >  2 files changed, 2 insertions(+)
> > 
> > diff --git a/hw/pci-bridge/gen_pcie_root_port.c b/hw/pci-bridge/gen_pcie_root_port.c
> > index 0e2f2e8bf1..e6ff1effd8 100644
> > --- a/hw/pci-bridge/gen_pcie_root_port.c
> > +++ b/hw/pci-bridge/gen_pcie_root_port.c
> > @@ -101,6 +101,7 @@ static void gen_rp_realize(DeviceState *dev, Error **errp)
> >  
> >  static const VMStateDescription vmstate_rp_dev = {
> >      .name = "pcie-root-port",
> > +    .priority = MIG_PRI_PCIE_ROOT_PORT,
> >      .version_id = 1,
> >      .minimum_version_id = 1,
> >      .post_load = pcie_cap_slot_post_load,
> > diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
> > index 8c3889433c..491449db9f 100644
> > --- a/include/migration/vmstate.h
> > +++ b/include/migration/vmstate.h
> > @@ -148,6 +148,7 @@ enum VMStateFlags {
> >  typedef enum {
> >      MIG_PRI_DEFAULT = 0,
> >      MIG_PRI_IOMMU,              /* Must happen before PCI devices */
> > +    MIG_PRI_PCIE_ROOT_PORT,     /* Must happen before IOMMU */
> >      MIG_PRI_GICV3_ITS,          /* Must happen before PCI devices */
> >      MIG_PRI_GICV3,              /* Must happen before the ITS */
> >      MIG_PRI_MAX,
> > 

Re: [Qemu-devel] [PATCH] pcie-root-port: let it has higher migrate priority
Posted by Dr. David Alan Gilbert 6 years, 2 months ago
* Michael S. Tsirkin (mst@redhat.com) wrote:
> On Thu, Feb 01, 2018 at 02:24:15PM +0200, Marcel Apfelbaum wrote:
> > Hi Peter,
> > 
> > On 01/02/2018 13:20, Peter Xu wrote:
> > > In the past, we prioritized IOMMU migration so that we have such a
> > > priority order:
> > > 
> > >     IOMMU > PCI Devices
> > > 
> > > When migrating a guest with both vIOMMU and pcie-root-port, we'll always
> > > migrate vIOMMU first, since pcie-root-port will be seen to have the same
> > > priority of general PCI devices.
> > > 
> > > That's problematic.
> > > 
> > > The thing is that PCI bus number information is stored in the root port,
> > > and that is needed by vIOMMU during post_load(), e.g., to figure out
> > > context entry for a device.  If we don't have correct bus numbers for
> > > devices, we won't be able to recover device state of the DMAR memory
> > > regions, and things will be messed up.
> > > 
> > > So let's boost the PCIe root ports to be even with higher priority:
> > > 
> > >    PCIe Root Port > IOMMU > PCI Devices
> > > 
> > > A smoke test shows that this patch fixes bug 1538953.
> > > 
> > > CC: Alex Williamson <alex.williamson@redhat.com>
> > > CC: Marcel Apfelbaum <marcel@redhat.com>
> > > CC: Michael S. Tsirkin <mst@redhat.com>
> > > CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > > CC: Juan Quintela <quintela@redhat.com>
> > > CC: Laurent Vivier <lvivier@redhat.com>
> > > Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1538953
> > > Reported-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> > > Signed-off-by: Peter Xu <peterx@redhat.com>
> > > ---
> > > Marcel & all,
> > > 
> > > I think it's possible that we need similar thing for other bridge-like
> > > devices, but I'm not that familiar.  Would you help confirm?  Thanks,
> > 
> > Is a pity we don't have a way to mark the migration priority
> > in a base class. Dave, maybe we do have a way?
> > 
> > In the meantime you would need to add it also to:
> > - ioh3420 (Intel root port)
> > - xio3130_downstream (Intel switch downstream port)
> > - xio3130_upstream (The counterpart of the above, you want the whole
> >   switch to be migrated before loading the IOMMU device state)
> > - pcie_pci_bridge (for pci devices)
> > - pci-pci bridge (if for some reason you have one attached to the pcie_pci_brdge)
> > - i82801b11 (dmi-pci bridge, we want to deprecate it bu is there for now)
> > 
> > Thanks,
> > Marcel
> 
> It's kind of strange that we need to set the priority manually.
> Can't migration figure it out itself?

> I think bus
> must always be migrated before the devices behind it ...

I think that's true; but:
  a) Is the iommu a child of any of the PCI busses?
  b) does anything ensure that the bridge that's a parent of a bus
     gets migrated before the bus it provides?
  c) What happens with more than one root port?

Dave


> > >  hw/pci-bridge/gen_pcie_root_port.c | 1 +
> > >  include/migration/vmstate.h        | 1 +
> > >  2 files changed, 2 insertions(+)
> > > 
> > > diff --git a/hw/pci-bridge/gen_pcie_root_port.c b/hw/pci-bridge/gen_pcie_root_port.c
> > > index 0e2f2e8bf1..e6ff1effd8 100644
> > > --- a/hw/pci-bridge/gen_pcie_root_port.c
> > > +++ b/hw/pci-bridge/gen_pcie_root_port.c
> > > @@ -101,6 +101,7 @@ static void gen_rp_realize(DeviceState *dev, Error **errp)
> > >  
> > >  static const VMStateDescription vmstate_rp_dev = {
> > >      .name = "pcie-root-port",
> > > +    .priority = MIG_PRI_PCIE_ROOT_PORT,
> > >      .version_id = 1,
> > >      .minimum_version_id = 1,
> > >      .post_load = pcie_cap_slot_post_load,
> > > diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
> > > index 8c3889433c..491449db9f 100644
> > > --- a/include/migration/vmstate.h
> > > +++ b/include/migration/vmstate.h
> > > @@ -148,6 +148,7 @@ enum VMStateFlags {
> > >  typedef enum {
> > >      MIG_PRI_DEFAULT = 0,
> > >      MIG_PRI_IOMMU,              /* Must happen before PCI devices */
> > > +    MIG_PRI_PCIE_ROOT_PORT,     /* Must happen before IOMMU */
> > >      MIG_PRI_GICV3_ITS,          /* Must happen before PCI devices */
> > >      MIG_PRI_GICV3,              /* Must happen before the ITS */
> > >      MIG_PRI_MAX,
> > > 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

Re: [Qemu-devel] [PATCH] pcie-root-port: let it has higher migrate priority
Posted by Marcel Apfelbaum 6 years, 2 months ago
On 01/02/2018 21:48, Dr. David Alan Gilbert wrote:
> * Michael S. Tsirkin (mst@redhat.com) wrote:
>> On Thu, Feb 01, 2018 at 02:24:15PM +0200, Marcel Apfelbaum wrote:
>>> Hi Peter,
>>>
>>> On 01/02/2018 13:20, Peter Xu wrote:
>>>> In the past, we prioritized IOMMU migration so that we have such a
>>>> priority order:
>>>>
>>>>     IOMMU > PCI Devices
>>>>
>>>> When migrating a guest with both vIOMMU and pcie-root-port, we'll always
>>>> migrate vIOMMU first, since pcie-root-port will be seen to have the same
>>>> priority of general PCI devices.
>>>>
>>>> That's problematic.
>>>>
>>>> The thing is that PCI bus number information is stored in the root port,
>>>> and that is needed by vIOMMU during post_load(), e.g., to figure out
>>>> context entry for a device.  If we don't have correct bus numbers for
>>>> devices, we won't be able to recover device state of the DMAR memory
>>>> regions, and things will be messed up.
>>>>
>>>> So let's boost the PCIe root ports to be even with higher priority:
>>>>
>>>>    PCIe Root Port > IOMMU > PCI Devices
>>>>
>>>> A smoke test shows that this patch fixes bug 1538953.
>>>>
>>>> CC: Alex Williamson <alex.williamson@redhat.com>
>>>> CC: Marcel Apfelbaum <marcel@redhat.com>
>>>> CC: Michael S. Tsirkin <mst@redhat.com>
>>>> CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
>>>> CC: Juan Quintela <quintela@redhat.com>
>>>> CC: Laurent Vivier <lvivier@redhat.com>
>>>> Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1538953
>>>> Reported-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>> Signed-off-by: Peter Xu <peterx@redhat.com>
>>>> ---
>>>> Marcel & all,
>>>>
>>>> I think it's possible that we need similar thing for other bridge-like
>>>> devices, but I'm not that familiar.  Would you help confirm?  Thanks,
>>>
>>> Is a pity we don't have a way to mark the migration priority
>>> in a base class. Dave, maybe we do have a way?
>>>
>>> In the meantime you would need to add it also to:
>>> - ioh3420 (Intel root port)
>>> - xio3130_downstream (Intel switch downstream port)
>>> - xio3130_upstream (The counterpart of the above, you want the whole
>>>   switch to be migrated before loading the IOMMU device state)
>>> - pcie_pci_bridge (for pci devices)
>>> - pci-pci bridge (if for some reason you have one attached to the pcie_pci_brdge)
>>> - i82801b11 (dmi-pci bridge, we want to deprecate it bu is there for now)
>>>
>>> Thanks,
>>> Marcel
>>
>> It's kind of strange that we need to set the priority manually.
>> Can't migration figure it out itself?
> 
>> I think bus
>> must always be migrated before the devices behind it ...
> 
> I think that's true; but:
>   a) Is the iommu a child of any of the PCI busses?

No, it is a sysbus device.

>   b) does anything ensure that the bridge that's a parent of a bus
>      gets migrated before the bus it provides?

I think this was Michael's question :)

>   c) What happens with more htan one root port?
> 

Root ports can't be nested. Anyway, I suppose the migration should
follow the bus numbering order.

The question now is what happens if the migration is happening before
the guest firmware finishes assigning numbers to buses...

Still, QEMU has enough information to decide the right ordering;
the question is whether the current migration mechanism has some ordering
of its own, or whether the only one is the "VMStateFlags".

Thanks,
Marcel

> Dave
> 
> 
>>>>  hw/pci-bridge/gen_pcie_root_port.c | 1 +
>>>>  include/migration/vmstate.h        | 1 +
>>>>  2 files changed, 2 insertions(+)
>>>>
>>>> diff --git a/hw/pci-bridge/gen_pcie_root_port.c b/hw/pci-bridge/gen_pcie_root_port.c
>>>> index 0e2f2e8bf1..e6ff1effd8 100644
>>>> --- a/hw/pci-bridge/gen_pcie_root_port.c
>>>> +++ b/hw/pci-bridge/gen_pcie_root_port.c
>>>> @@ -101,6 +101,7 @@ static void gen_rp_realize(DeviceState *dev, Error **errp)
>>>>  
>>>>  static const VMStateDescription vmstate_rp_dev = {
>>>>      .name = "pcie-root-port",
>>>> +    .priority = MIG_PRI_PCIE_ROOT_PORT,
>>>>      .version_id = 1,
>>>>      .minimum_version_id = 1,
>>>>      .post_load = pcie_cap_slot_post_load,
>>>> diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
>>>> index 8c3889433c..491449db9f 100644
>>>> --- a/include/migration/vmstate.h
>>>> +++ b/include/migration/vmstate.h
>>>> @@ -148,6 +148,7 @@ enum VMStateFlags {
>>>>  typedef enum {
>>>>      MIG_PRI_DEFAULT = 0,
>>>>      MIG_PRI_IOMMU,              /* Must happen before PCI devices */
>>>> +    MIG_PRI_PCIE_ROOT_PORT,     /* Must happen before IOMMU */
>>>>      MIG_PRI_GICV3_ITS,          /* Must happen before PCI devices */
>>>>      MIG_PRI_GICV3,              /* Must happen before the ITS */
>>>>      MIG_PRI_MAX,
>>>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> 


Re: [Qemu-devel] [PATCH] pcie-root-port: let it has higher migrate priority
Posted by Dr. David Alan Gilbert 6 years, 2 months ago
* Marcel Apfelbaum (marcel@redhat.com) wrote:
> On 01/02/2018 21:48, Dr. David Alan Gilbert wrote:
> > * Michael S. Tsirkin (mst@redhat.com) wrote:
> >> On Thu, Feb 01, 2018 at 02:24:15PM +0200, Marcel Apfelbaum wrote:
> >>> Hi Peter,
> >>>
> >>> On 01/02/2018 13:20, Peter Xu wrote:
> >>>> In the past, we prioritized IOMMU migration so that we have such a
> >>>> priority order:
> >>>>
> >>>>     IOMMU > PCI Devices
> >>>>
> >>>> When migrating a guest with both vIOMMU and pcie-root-port, we'll always
> >>>> migrate vIOMMU first, since pcie-root-port will be seen to have the same
> >>>> priority of general PCI devices.
> >>>>
> >>>> That's problematic.
> >>>>
> >>>> The thing is that PCI bus number information is stored in the root port,
> >>>> and that is needed by vIOMMU during post_load(), e.g., to figure out
> >>>> context entry for a device.  If we don't have correct bus numbers for
> >>>> devices, we won't be able to recover device state of the DMAR memory
> >>>> regions, and things will be messed up.
> >>>>
> >>>> So let's boost the PCIe root ports to be even with higher priority:
> >>>>
> >>>>    PCIe Root Port > IOMMU > PCI Devices
> >>>>
> >>>> A smoke test shows that this patch fixes bug 1538953.
> >>>>
> >>>> CC: Alex Williamson <alex.williamson@redhat.com>
> >>>> CC: Marcel Apfelbaum <marcel@redhat.com>
> >>>> CC: Michael S. Tsirkin <mst@redhat.com>
> >>>> CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
> >>>> CC: Juan Quintela <quintela@redhat.com>
> >>>> CC: Laurent Vivier <lvivier@redhat.com>
> >>>> Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1538953
> >>>> Reported-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> >>>> Signed-off-by: Peter Xu <peterx@redhat.com>
> >>>> ---
> >>>> Marcel & all,
> >>>>
> >>>> I think it's possible that we need similar thing for other bridge-like
> >>>> devices, but I'm not that familiar.  Would you help confirm?  Thanks,
> >>>
> >>> Is a pity we don't have a way to mark the migration priority
> >>> in a base class. Dave, maybe we do have a way?
> >>>
> >>> In the meantime you would need to add it also to:
> >>> - ioh3420 (Intel root port)
> >>> - xio3130_downstream (Intel switch downstream port)
> >>> - xio3130_upstream (The counterpart of the above, you want the whole
> >>>   switch to be migrated before loading the IOMMU device state)
> >>> - pcie_pci_bridge (for pci devices)
> >>> - pci-pci bridge (if for some reason you have one attached to the pcie_pci_brdge)
> >>> - i82801b11 (dmi-pci bridge, we want to deprecate it bu is there for now)
> >>>
> >>> Thanks,
> >>> Marcel
> >>
> >> It's kind of strange that we need to set the priority manually.
> >> Can't migration figure it out itself?
> > 
> >> I think bus
> >> must always be migrated before the devices behind it ...
> > 
> > I think that's true; but:
> >   a) Is the iommu a child of any of the PCI busses?
> 
> No, is a sysbus device.

OK, so even if we were arguing about whether PCI busses got migrated in
order, that doesn't help Peter's case, because there's nothing that
orders the iommu relative to the PCI.

> >   b) does anything ensure that the bridge that's a parent of a bus
> >      gets migrated before the bus it provides?
> 
> I think this was Michael's question :)
> 
> >   c) What happens with more htan one root port?
> > 
> 
> Root ports can't be nested, anyway, I suppose the migration should
> follow the bus numbering order.
> 
> The question now is what happens if the migration is happening before
> the guest firmware finishes assigning numbers to buses...
> 
> Still, QEMU has enough information to decide the right ordering,
> the question is if the current migration mechanism has some ordering
> or the only one is the "VMStateFlags".

It's ordered on two things:
  a) The priority field that Peter is using
  b) The order of registration of the device.

(b) is of course dangerously unstable.
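
As a rough sketch of those two rules (not the actual QEMU code; the
Entry/EntryList types and both helpers are made up for illustration),
an insertion that keeps higher priorities first and otherwise keeps
registration order could look like this:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ENTRIES 16

/* Hypothetical stand-in for a registered migration handler. */
typedef struct {
    const char *name;
    int priority;           /* larger value = migrated earlier */
} Entry;

typedef struct {
    Entry items[MAX_ENTRIES];
    size_t count;
} EntryList;

/* Insert so that the list stays sorted by descending priority.
 * Entries of equal priority keep their registration order, because
 * only entries with a strictly lower priority are shifted back. */
static void entry_list_insert(EntryList *list, const char *name, int priority)
{
    size_t pos = list->count;

    assert(list->count < MAX_ENTRIES);
    while (pos > 0 && list->items[pos - 1].priority < priority) {
        list->items[pos] = list->items[pos - 1];
        pos--;
    }
    list->items[pos].name = name;
    list->items[pos].priority = priority;
    list->count++;
}

/* Return the position of 'name' in migration order, or -1 if absent. */
static int entry_list_index_of(const EntryList *list, const char *name)
{
    for (size_t i = 0; i < list->count; i++) {
        if (strcmp(list->items[i].name, name) == 0) {
            return (int)i;
        }
    }
    return -1;
}
```

In this model two devices with the same priority end up in whatever
order they happened to be registered, which is the unstable part.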

Dave

> Thanks,
> Marcel
> 
> > Dave
> > 
> > 
> >>>>  hw/pci-bridge/gen_pcie_root_port.c | 1 +
> >>>>  include/migration/vmstate.h        | 1 +
> >>>>  2 files changed, 2 insertions(+)
> >>>>
> >>>> diff --git a/hw/pci-bridge/gen_pcie_root_port.c b/hw/pci-bridge/gen_pcie_root_port.c
> >>>> index 0e2f2e8bf1..e6ff1effd8 100644
> >>>> --- a/hw/pci-bridge/gen_pcie_root_port.c
> >>>> +++ b/hw/pci-bridge/gen_pcie_root_port.c
> >>>> @@ -101,6 +101,7 @@ static void gen_rp_realize(DeviceState *dev, Error **errp)
> >>>>  
> >>>>  static const VMStateDescription vmstate_rp_dev = {
> >>>>      .name = "pcie-root-port",
> >>>> +    .priority = MIG_PRI_PCIE_ROOT_PORT,
> >>>>      .version_id = 1,
> >>>>      .minimum_version_id = 1,
> >>>>      .post_load = pcie_cap_slot_post_load,
> >>>> diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
> >>>> index 8c3889433c..491449db9f 100644
> >>>> --- a/include/migration/vmstate.h
> >>>> +++ b/include/migration/vmstate.h
> >>>> @@ -148,6 +148,7 @@ enum VMStateFlags {
> >>>>  typedef enum {
> >>>>      MIG_PRI_DEFAULT = 0,
> >>>>      MIG_PRI_IOMMU,              /* Must happen before PCI devices */
> >>>> +    MIG_PRI_PCIE_ROOT_PORT,     /* Must happen before IOMMU */
> >>>>      MIG_PRI_GICV3_ITS,          /* Must happen before PCI devices */
> >>>>      MIG_PRI_GICV3,              /* Must happen before the ITS */
> >>>>      MIG_PRI_MAX,
> >>>>
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> > 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

Re: [Qemu-devel] [PATCH] pcie-root-port: let it has higher migrate priority
Posted by Peter Xu 6 years, 2 months ago
On Thu, Feb 01, 2018 at 10:01:31PM +0200, Marcel Apfelbaum wrote:

[...]

> Root ports can't be nested, anyway, I suppose the migration should
> follow the bus numbering order.

Could I ask whether this is a must?  And if yes, why?

> 
> The question now is what happens if the migration is happening before
> the guest firmware finishes assigning numbers to buses...

Do you mean that vIOMMU may fetch wrong context entries too?

Note that as long as vIOMMU DMAR is off globally, vIOMMU will not
fetch context entries at all.  So IMHO this problem should not happen
during the firmware execution time (assuming that the firmware should
not enable vIOMMU at all).

Thanks,

-- 
Peter Xu

Re: [Qemu-devel] [PATCH] pcie-root-port: let it has higher migrate priority
Posted by Marcel Apfelbaum 6 years, 2 months ago
On 02/02/2018 12:04, Peter Xu wrote:
> On Thu, Feb 01, 2018 at 10:01:31PM +0200, Marcel Apfelbaum wrote:
> 
> [...]
> 
>> Root ports can't be nested, anyway, I suppose the migration should
>> follow the bus numbering order.
> 
> Could I ask whether this is a must?  And if yes, why?
> 

Not sure. The above will ensure that if a device needs some parent/bus
info at load time, the information will be valid.
But if it worked until now, maybe most of the devices do not need that.

>>
>> The question now is what happens if the migration is happening before
>> the guest firmware finishes assigning numbers to buses...
> 
> Do you mean that vIOMMU may fetch wrong context entries too?
> 

No, only that the bus number will not be available at load time.
In this case it is OK, since the firmware will continue to
assign bus numbers on the target side.

Thanks,
Marcel

> Note that as long as vIOMMU DMAR is off globally, vIOMMU will not
> fetch context entries at all.  So IMHO this problem should not happen
> during the firmware execution time (assuming that the firmware should
> not enable vIOMMU at all).
> 
> Thanks,
> 


Re: [Qemu-devel] [PATCH] pcie-root-port: let it has higher migrate priority
Posted by Dr. David Alan Gilbert 6 years, 2 months ago
* Marcel Apfelbaum (marcel@redhat.com) wrote:
> Hi Peter,
> 
> On 01/02/2018 13:20, Peter Xu wrote:
> > In the past, we prioritized IOMMU migration so that we have such a
> > priority order:
> > 
> >     IOMMU > PCI Devices
> > 
> > When migrating a guest with both vIOMMU and pcie-root-port, we'll always
> > migrate vIOMMU first, since pcie-root-port will be seen to have the same
> > priority of general PCI devices.
> > 
> > That's problematic.
> > 
> > The thing is that PCI bus number information is stored in the root port,
> > and that is needed by vIOMMU during post_load(), e.g., to figure out
> > context entry for a device.  If we don't have correct bus numbers for
> > devices, we won't be able to recover device state of the DMAR memory
> > regions, and things will be messed up.
> > 
> > So let's boost the PCIe root ports to be even with higher priority:
> > 
> >    PCIe Root Port > IOMMU > PCI Devices
> > 
> > A smoke test shows that this patch fixes bug 1538953.
> > 
> > CC: Alex Williamson <alex.williamson@redhat.com>
> > CC: Marcel Apfelbaum <marcel@redhat.com>
> > CC: Michael S. Tsirkin <mst@redhat.com>
> > CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > CC: Juan Quintela <quintela@redhat.com>
> > CC: Laurent Vivier <lvivier@redhat.com>
> > Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1538953
> > Reported-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> > Marcel & all,
> > 
> > I think it's possible that we need similar thing for other bridge-like
> > devices, but I'm not that familiar.  Would you help confirm?  Thanks,
> 
> Is a pity we don't have a way to mark the migration priority
> in a base class. Dave, maybe we do have a way?

Not that I'm aware of; the 'priority' field is associated with the VMSD
and it's not really connected to the class hierarchy at all.
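
To illustrate the consequence (a toy model only; MockVMSD, the PRI_*
values and the helper are invented for this example and the labels are
just the device names from Marcel's list, not real VMSD name strings):
since the priority lives in each description, every bridge-like device
has to set it itself, and one that is forgotten silently stays at the
default.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative priority values; larger means migrated earlier. */
enum { PRI_DEFAULT = 0, PRI_IOMMU = 1, PRI_BRIDGE = 2 };

/* Hypothetical, cut-down model of a per-device description. */
typedef struct {
    const char *label;
    int priority;       /* set per description, nothing is inherited */
} MockVMSD;

/* Count descriptions whose priority does not guarantee that they are
 * migrated before the IOMMU (equal or lower priority). */
static size_t count_not_before_iommu(const MockVMSD *descs, size_t n)
{
    size_t late = 0;

    assert(descs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (descs[i].priority <= PRI_IOMMU) {
            late++;
        }
    }
    return late;
}
```

A check of this kind over all bridge-like devices would flag any
description that was left at the default priority.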

Dave

> In the meantime you would need to add it also to:
> - ioh3420 (Intel root port)
> - xio3130_downstream (Intel switch downstream port)
> - xio3130_upstream (The counterpart of the above, you want the whole
>   switch to be migrated before loading the IOMMU device state)
> - pcie_pci_bridge (for pci devices)
> - pci-pci bridge (if for some reason you have one attached to the pcie_pci_brdge)
> - i82801b11 (dmi-pci bridge, we want to deprecate it bu is there for now)
> 
> Thanks,
> Marcel
> 
> >  hw/pci-bridge/gen_pcie_root_port.c | 1 +
> >  include/migration/vmstate.h        | 1 +
> >  2 files changed, 2 insertions(+)
> > 
> > diff --git a/hw/pci-bridge/gen_pcie_root_port.c b/hw/pci-bridge/gen_pcie_root_port.c
> > index 0e2f2e8bf1..e6ff1effd8 100644
> > --- a/hw/pci-bridge/gen_pcie_root_port.c
> > +++ b/hw/pci-bridge/gen_pcie_root_port.c
> > @@ -101,6 +101,7 @@ static void gen_rp_realize(DeviceState *dev, Error **errp)
> >  
> >  static const VMStateDescription vmstate_rp_dev = {
> >      .name = "pcie-root-port",
> > +    .priority = MIG_PRI_PCIE_ROOT_PORT,
> >      .version_id = 1,
> >      .minimum_version_id = 1,
> >      .post_load = pcie_cap_slot_post_load,
> > diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
> > index 8c3889433c..491449db9f 100644
> > --- a/include/migration/vmstate.h
> > +++ b/include/migration/vmstate.h
> > @@ -148,6 +148,7 @@ enum VMStateFlags {
> >  typedef enum {
> >      MIG_PRI_DEFAULT = 0,
> >      MIG_PRI_IOMMU,              /* Must happen before PCI devices */
> > +    MIG_PRI_PCIE_ROOT_PORT,     /* Must happen before IOMMU */
> >      MIG_PRI_GICV3_ITS,          /* Must happen before PCI devices */
> >      MIG_PRI_GICV3,              /* Must happen before the ITS */
> >      MIG_PRI_MAX,
> > 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

Re: [Qemu-devel] [PATCH] pcie-root-port: let it has higher migrate priority
Posted by Peter Xu 6 years, 2 months ago
On Thu, Feb 01, 2018 at 02:24:15PM +0200, Marcel Apfelbaum wrote:
> Hi Peter,
> 
> On 01/02/2018 13:20, Peter Xu wrote:
> > In the past, we prioritized IOMMU migration so that we have such a
> > priority order:
> > 
> >     IOMMU > PCI Devices
> > 
> > When migrating a guest with both vIOMMU and pcie-root-port, we'll always
> > migrate vIOMMU first, since pcie-root-port will be seen to have the same
> > priority of general PCI devices.
> > 
> > That's problematic.
> > 
> > The thing is that PCI bus number information is stored in the root port,
> > and that is needed by vIOMMU during post_load(), e.g., to figure out
> > context entry for a device.  If we don't have correct bus numbers for
> > devices, we won't be able to recover device state of the DMAR memory
> > regions, and things will be messed up.
> > 
> > So let's boost the PCIe root ports to be even with higher priority:
> > 
> >    PCIe Root Port > IOMMU > PCI Devices
> > 
> > A smoke test shows that this patch fixes bug 1538953.
> > 
> > CC: Alex Williamson <alex.williamson@redhat.com>
> > CC: Marcel Apfelbaum <marcel@redhat.com>
> > CC: Michael S. Tsirkin <mst@redhat.com>
> > CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > CC: Juan Quintela <quintela@redhat.com>
> > CC: Laurent Vivier <lvivier@redhat.com>
> > Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1538953
> > Reported-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> > Marcel & all,
> > 
> > I think it's possible that we need similar thing for other bridge-like
> > devices, but I'm not that familiar.  Would you help confirm?  Thanks,
> 
> It's a pity we don't have a way to mark the migration priority
> in a base class. Dave, maybe we do have a way?
> 
> In the meantime you would need to add it also to:
> - ioh3420 (Intel root port)
> - xio3130_downstream (TI switch downstream port)
> - xio3130_upstream (The counterpart of the above, you want the whole
>   switch to be migrated before loading the IOMMU device state)
> - pcie_pci_bridge (for pci devices)
> - pci-pci bridge (if for some reason you have one attached to the pcie_pci_bridge)
> - i82801b11 (dmi-pci bridge, we want to deprecate it but it is there for now)

I'll see whether there is a better way to do this than duplicating it
in every device, but I'm not really sure there is one.

Anyway, this list is helpful.  Thanks Marcel.
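
On the base-class point, one hypothetical shape for it is a lookup that
falls back to the parent type when a device sets no priority of its own.
This is only a toy sketch: ToyClass, has_priority and
toy_effective_priority() are made-up names, not QEMU types or APIs, and
only the MIG_PRI_* names are taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Names taken from the patch; the values here are local to this sketch. */
typedef enum {
    MIG_PRI_DEFAULT = 0,
    MIG_PRI_IOMMU,
    MIG_PRI_PCIE_ROOT_PORT,
} MigrationPriority;

/* Hypothetical class record, NOT a QEMU type. */
typedef struct ToyClass {
    const char *name;
    const struct ToyClass *parent;  /* NULL at the root of the hierarchy */
    int has_priority;               /* non-zero if this class sets one */
    MigrationPriority priority;
} ToyClass;

/* Walk up the parent chain and return the first explicitly set priority. */
static MigrationPriority toy_effective_priority(const ToyClass *klass)
{
    assert(klass != NULL);
    for (; klass != NULL; klass = klass->parent) {
        if (klass->has_priority) {
            return klass->priority;
        }
    }
    return MIG_PRI_DEFAULT;
}
```

With something of this shape, each bridge-like device in the list would
inherit the priority from a common bridge parent rather than repeating
the field.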

-- 
Peter Xu

Re: [Qemu-devel] [PATCH] pcie-root-port: let it has higher migrate priority
Posted by Dr. David Alan Gilbert 6 years, 2 months ago
* Peter Xu (peterx@redhat.com) wrote:
> In the past, we prioritized IOMMU migration so that we have such a
> priority order:
> 
>     IOMMU > PCI Devices
> 
> When migrating a guest with both vIOMMU and pcie-root-port, we'll always
> migrate vIOMMU first, since pcie-root-port will be seen to have the same
> priority of general PCI devices.
> 
> That's problematic.
> 
> The thing is that PCI bus number information is stored in the root port,
> and that is needed by vIOMMU during post_load(), e.g., to figure out
> context entry for a device.  If we don't have correct bus numbers for
> devices, we won't be able to recover device state of the DMAR memory
> regions, and things will be messed up.
> 
> So let's boost the PCIe root ports to be even with higher priority:
> 
>    PCIe Root Port > IOMMU > PCI Devices
> 
> A smoke test shows that this patch fixes bug 1538953.

Three questions (partially overlapping with my reply to Michael):
  a) What happens with multiple IOMMUs?
  b) What happens with multiple root ports?
  c) How correct is this ordering on different implementations
    (e.g. ARM/Power/etc)?

Dave

> 
> CC: Alex Williamson <alex.williamson@redhat.com>
> CC: Marcel Apfelbaum <marcel@redhat.com>
> CC: Michael S. Tsirkin <mst@redhat.com>
> CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
> CC: Juan Quintela <quintela@redhat.com>
> CC: Laurent Vivier <lvivier@redhat.com>
> Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1538953
> Reported-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
> Marcel & all,
> 
> I think it's possible that we need similar thing for other bridge-like
> devices, but I'm not that familiar.  Would you help confirm?  Thanks,
> ---
>  hw/pci-bridge/gen_pcie_root_port.c | 1 +
>  include/migration/vmstate.h        | 1 +
>  2 files changed, 2 insertions(+)
> 
> diff --git a/hw/pci-bridge/gen_pcie_root_port.c b/hw/pci-bridge/gen_pcie_root_port.c
> index 0e2f2e8bf1..e6ff1effd8 100644
> --- a/hw/pci-bridge/gen_pcie_root_port.c
> +++ b/hw/pci-bridge/gen_pcie_root_port.c
> @@ -101,6 +101,7 @@ static void gen_rp_realize(DeviceState *dev, Error **errp)
>  
>  static const VMStateDescription vmstate_rp_dev = {
>      .name = "pcie-root-port",
> +    .priority = MIG_PRI_PCIE_ROOT_PORT,
>      .version_id = 1,
>      .minimum_version_id = 1,
>      .post_load = pcie_cap_slot_post_load,
> diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
> index 8c3889433c..491449db9f 100644
> --- a/include/migration/vmstate.h
> +++ b/include/migration/vmstate.h
> @@ -148,6 +148,7 @@ enum VMStateFlags {
>  typedef enum {
>      MIG_PRI_DEFAULT = 0,
>      MIG_PRI_IOMMU,              /* Must happen before PCI devices */
> +    MIG_PRI_PCIE_ROOT_PORT,     /* Must happen before IOMMU */
>      MIG_PRI_GICV3_ITS,          /* Must happen before PCI devices */
>      MIG_PRI_GICV3,              /* Must happen before the ITS */
>      MIG_PRI_MAX,
> -- 
> 2.14.3
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

Re: [Qemu-devel] [PATCH] pcie-root-port: let it has higher migrate priority
Posted by Peter Xu 6 years, 2 months ago
On Thu, Feb 01, 2018 at 07:51:31PM +0000, Dr. David Alan Gilbert wrote:
> * Peter Xu (peterx@redhat.com) wrote:
> > In the past, we prioritized IOMMU migration so that we have such a
> > priority order:
> > 
> >     IOMMU > PCI Devices
> > 
> > When migrating a guest with both vIOMMU and pcie-root-port, we'll always
> > migrate vIOMMU first, since pcie-root-port will be seen to have the same
> > priority of general PCI devices.
> > 
> > That's problematic.
> > 
> > The thing is that PCI bus number information is stored in the root port,
> > and that is needed by vIOMMU during post_load(), e.g., to figure out
> > context entry for a device.  If we don't have correct bus numbers for
> > devices, we won't be able to recover device state of the DMAR memory
> > regions, and things will be messed up.
> > 
> > So let's boost the PCIe root ports to be even with higher priority:
> > 
> >    PCIe Root Port > IOMMU > PCI Devices
> > 
> > A smoke test shows that this patch fixes bug 1538953.
> 
> Three questions (partially overlapping with my reply to Michael):
>   a) What happens with multiple IOMMUs?

If there are more IOMMUs, then with this patch all the vIOMMUs will be
migrated after the pcie root ports.

But a more honest answer is: I don't really know. :)

I don't even know how multiple vIOMMUs would cooperate with each
other, especially when nested.  In the nested case there may be
dependencies between vIOMMUs, but I'll avoid thinking about that until
we support more than one vIOMMU.

>   b) What happens with multiple root ports?

Same answer as the previous one: all of them will be migrated before
any vIOMMUs.

Note that IMHO we don't care which pcie root port is migrated first,
since they should not depend on each other, but Marcel may correct me.
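
To illustrate the ordering described above, here is a toy model (not
QEMU's actual code) of a list kept sorted by descending priority, where
entries of equal priority stay in registration order.  Entry, EntryList,
entry_list_insert() and entry_list_index() are names invented for this
sketch; only the MIG_PRI_* names come from the patch.  It assumes that a
higher priority value means earlier handling, as the patch description
implies.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Names taken from the patch; everything else below is hypothetical. */
typedef enum {
    MIG_PRI_DEFAULT = 0,
    MIG_PRI_IOMMU,
    MIG_PRI_PCIE_ROOT_PORT,
} MigrationPriority;

typedef struct {
    const char *name;
    MigrationPriority priority;
} Entry;

#define MAX_ENTRIES 16

typedef struct {
    Entry items[MAX_ENTRIES];
    size_t count;
} EntryList;

/*
 * Insert while keeping the list sorted by descending priority.  The new
 * entry goes after every existing entry whose priority is >= its own, so
 * entries of equal priority keep their registration order.
 * Returns 0 on success, -1 if the list is full.
 */
static int entry_list_insert(EntryList *list, const char *name,
                             MigrationPriority priority)
{
    size_t pos = 0;

    assert(list != NULL && name != NULL);
    if (list->count >= MAX_ENTRIES) {
        return -1;
    }
    while (pos < list->count && list->items[pos].priority >= priority) {
        pos++;
    }
    /* Shift the lower-priority tail down by one slot. */
    memmove(&list->items[pos + 1], &list->items[pos],
            (list->count - pos) * sizeof(Entry));
    list->items[pos].name = name;
    list->items[pos].priority = priority;
    list->count++;
    return 0;
}

/* Return the position of the named entry, or -1 if it is absent. */
static int entry_list_index(const EntryList *list, const char *name)
{
    assert(list != NULL && name != NULL);
    for (size_t i = 0; i < list->count; i++) {
        if (strcmp(list->items[i].name, name) == 0) {
            return (int)i;
        }
    }
    return -1;
}
```

In this model, registering a default-priority device, a vIOMMU and two
root ports in that order leaves both root ports at the front (in the
order they were registered), then the vIOMMU, then the device.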

>   c) How correct is this ordering on different implementations 
>     (e.g. ARM/Power/etc)

Currently it has no effect there, since the Intel IOMMU is the only
user of MIG_PRI_IOMMU.  After SMMU is merged it may be affected (if it
uses this priority), but IMHO that's fine too as long as pcie root
ports don't depend on anything related to SMMU.
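
For reference, a small sketch of what the enum change does to the
numeric values.  The two enums below copy the enumerator order from the
diff before and after the patch; the OLD_/NEW_ prefixes and the helper
gic_relative_order_unchanged() exist only in this sketch so that both
versions can sit in one file.

```c
#include <assert.h>

/* Enumerator order before the patch (OLD_ prefix is for this sketch only). */
enum {
    OLD_MIG_PRI_DEFAULT = 0,
    OLD_MIG_PRI_IOMMU,
    OLD_MIG_PRI_GICV3_ITS,
    OLD_MIG_PRI_GICV3,
    OLD_MIG_PRI_MAX,
};

/* Enumerator order after the patch (NEW_ prefix is for this sketch only). */
enum {
    NEW_MIG_PRI_DEFAULT = 0,
    NEW_MIG_PRI_IOMMU,
    NEW_MIG_PRI_PCIE_ROOT_PORT,
    NEW_MIG_PRI_GICV3_ITS,
    NEW_MIG_PRI_GICV3,
    NEW_MIG_PRI_MAX,
};

/* Compile-time checks of the relations stated in the enum comments. */
static_assert(NEW_MIG_PRI_PCIE_ROOT_PORT > NEW_MIG_PRI_IOMMU,
              "root port must sort ahead of the IOMMU");
static_assert(NEW_MIG_PRI_IOMMU > NEW_MIG_PRI_DEFAULT,
              "IOMMU must sort ahead of plain PCI devices");

/* Return 1 if GICv3 and the ITS keep their relative order across the patch. */
static int gic_relative_order_unchanged(void)
{
    int before = OLD_MIG_PRI_GICV3 > OLD_MIG_PRI_GICV3_ITS;
    int after = NEW_MIG_PRI_GICV3 > NEW_MIG_PRI_GICV3_ITS;

    return before && after;
}
```

One thing visible here is that the ITS and GICv3 values each shift up by
one, so numerically both still sit above the new root port priority.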

Thanks,

-- 
Peter Xu

Re: [Qemu-devel] [PATCH] pcie-root-port: let it has higher migrate priority
Posted by Marcel Apfelbaum 6 years, 2 months ago
On 02/02/2018 11:56, Peter Xu wrote:
> On Thu, Feb 01, 2018 at 07:51:31PM +0000, Dr. David Alan Gilbert wrote:
>> * Peter Xu (peterx@redhat.com) wrote:
>>> In the past, we prioritized IOMMU migration so that we have such a
>>> priority order:
>>>
>>>     IOMMU > PCI Devices
>>>
>>> When migrating a guest with both vIOMMU and pcie-root-port, we'll always
>>> migrate vIOMMU first, since pcie-root-port will be seen to have the same
>>> priority of general PCI devices.
>>>
>>> That's problematic.
>>>
>>> The thing is that PCI bus number information is stored in the root port,
>>> and that is needed by vIOMMU during post_load(), e.g., to figure out
>>> context entry for a device.  If we don't have correct bus numbers for
>>> devices, we won't be able to recover device state of the DMAR memory
>>> regions, and things will be messed up.
>>>
>>> So let's boost the PCIe root ports to be even with higher priority:
>>>
>>>    PCIe Root Port > IOMMU > PCI Devices
>>>
>>> A smoke test shows that this patch fixes bug 1538953.
>>
>> Three questions (partially overlapping with my reply to Michael):
>>   a) What happens with multiple IOMMUs?
> 
> If there are more IOMMUs, then the patch will let all the vIOMMUs be
> migrated after pcie root ports.
> 
> But a more true answer is that: I don't really know. :)
> 
> Because I even don't know how multiple vIOMMUs will coop with each
> other, especially nested. 

I am not aware of "nested" IOMMUs. Multiple IOMMUs work together
by dividing the bus ranges, with each of them declaring in the
corresponding ACPI table the bus/device range it is in charge of.

However, there was a kernel bug some time ago preventing several
IOMMUs from working together; I am not sure the problem is solved yet.
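
A rough sketch of the "dividing the bus ranges" idea, only to show why
the lookup depends on correct bus numbers.  IommuBusRange and
iommu_for_bus() are hypothetical names; this does not reflect the real
ACPI table layout or any QEMU or kernel structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of the bus range one IOMMU is in charge of. */
typedef struct {
    const char *iommu_name;
    uint8_t first_bus;  /* inclusive */
    uint8_t last_bus;   /* inclusive */
} IommuBusRange;

/* Return the range covering bus_num, or NULL if no IOMMU claims it. */
static const IommuBusRange *iommu_for_bus(const IommuBusRange *ranges,
                                          size_t n, uint8_t bus_num)
{
    assert(ranges != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (bus_num >= ranges[i].first_bus &&
            bus_num <= ranges[i].last_bus) {
            return &ranges[i];
        }
    }
    return NULL;
}
```

If a device's bus number is stale at lookup time, a model like this
picks the wrong IOMMU or none at all.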


> In nested case, maybe there will be
> dependency between vIOMMUs, but I'll avoid thinking about that until
> we support more than one vIOMMUs.
> 
>>   b) What happens with multiple root ports?
> 
> Same answer as previous one: all of them will be migrated before any
> vIOMMUs.
> 
> Note that IMHO we don't care which pcie root port is migrated first -
> IMHO they should not depend on each other, but Marcel may correct me.
> 

Right, the Root Ports are independent of each other.

Thanks,
Marcel

>>   c) How correct is this ordering on different implementations 
>>     (e.g. ARM/Power/etc)
> 
> Currently it won't affect since Intel IOMMU is the only user for
> MIG_PRI_IOMMU.  After SMMU is merged it may affect (if it uses this
> bit), but IMHO it's fine too as long as pcie root ports won't depend
> on anything related to SMMU.
> 
> Thanks,
>