At the moment VFIO PCI device initialization works as follows:
vfio_realize
vfio_get_group
vfio_connect_container
register memory listeners (1)
update QEMU groups lists
vfio_kvm_device_add_group
Then (example for pseries) the machine reset hook triggers region_add()
for all regions where listeners from (1) are listening:
ppc_spapr_reset
spapr_phb_reset
spapr_tce_table_enable
memory_region_add_subregion
vfio_listener_region_add
vfio_spapr_create_window
This scheme works fine until we need to handle VFIO PCI device hotplug
_and_ we want to enable in-kernel acceleration on, i.e. after PCI hotplug
we need a place to call
ioctl(vfio_kvm_device_fd, KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE).
Since the ioctl needs a LIOBN fd (from sPAPRTCETable) and a IOMMU group fd
(from VFIOGroup), vfio_listener_region_add() seems to be the only place
for this ioctl().
However this only works during boot time because the machine reset
happens strictly after all devices are finalized. When hotplug happens,
vfio_listener_region_add() is called when a memory listener is registered
but when this happens:
1. new group is not added to the container->group_list yet;
2. VFIO KVM device is unaware of the new IOMMU group.
This moves bits around to have all necessary VFIO infrastructure
in place for both initial startup and hotplug cases.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
Changes:
v2:
* moved container->initialized back to its correct location
* added missing QLIST_REMOVE()
---
hw/vfio/common.c | 19 +++++++++++--------
1 file changed, 11 insertions(+), 8 deletions(-)
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index ab95db689c..e8188eb3d5 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1087,6 +1087,14 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
goto free_container_exit;
}
+ vfio_kvm_device_add_group(group);
+
+ QLIST_INIT(&container->group_list);
+ QLIST_INSERT_HEAD(&space->containers, container, next);
+
+ group->container = container;
+ QLIST_INSERT_HEAD(&container->group_list, group, container_next);
+
container->listener = vfio_memory_listener;
memory_listener_register(&container->listener, container->space->as);
@@ -1100,14 +1108,11 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
container->initialized = true;
- QLIST_INIT(&container->group_list);
- QLIST_INSERT_HEAD(&space->containers, container, next);
-
- group->container = container;
- QLIST_INSERT_HEAD(&container->group_list, group, container_next);
-
return 0;
listener_release_exit:
+ QLIST_REMOVE(group, container_next);
+ QLIST_REMOVE(container, next);
+ vfio_kvm_device_del_group(group);
vfio_listener_release(container);
free_container_exit:
@@ -1212,8 +1217,6 @@ VFIOGroup *vfio_get_group(int groupid, AddressSpace *as, Error **errp)
QLIST_INSERT_HEAD(&vfio_group_list, group, next);
- vfio_kvm_device_add_group(group);
-
return group;
close_fd_exit:
--
2.11.0
On Sat, Apr 01, 2017 at 11:37:39PM +1100, Alexey Kardashevskiy wrote: > At the moment VFIO PCI device initialization works as follows: > vfio_realize > vfio_get_group > vfio_connect_container > register memory listeners (1) > update QEMU groups lists > vfio_kvm_device_add_group > > Then (example for pseries) the machine reset hook triggers region_add() > for all regions where listeners from (1) are listening: > > ppc_spapr_reset > spapr_phb_reset > spapr_tce_table_enable > memory_region_add_subregion > vfio_listener_region_add > vfio_spapr_create_window > > This scheme works fine until we need to handle VFIO PCI device hotplug > _and_ we want to enable in-kernel acceleration on, i.e. after PCI hotplug > we need a place to call > ioctl(vfio_kvm_device_fd, KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE). > Since the ioctl needs a LIOBN fd (from sPAPRTCETable) and a IOMMU group fd > (from VFIOGroup), vfio_listener_region_add() seems to be the only place > for this ioctl(). > > However this only works during boot time because the machine reset > happens strictly after all devices are finalized. When hotplug happens, > vfio_listener_region_add() is called when a memory listener is registered > but when this happens: > 1. new group is not added to the container->group_list yet; > 2. VFIO KVM device is unaware of the new IOMMU group. > > This moves bits around to have all necessary VFIO infrastructure > in place for both initial startup and hotplug cases. > > Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> > --- > Changes: > v2: > * moved container->initialized back to its correct location > * added missing QLIST_REMOVE() > --- > hw/vfio/common.c | 19 +++++++++++-------- > 1 file changed, 11 insertions(+), 8 deletions(-) > > diff --git a/hw/vfio/common.c b/hw/vfio/common.c > index ab95db689c..e8188eb3d5 100644 > --- a/hw/vfio/common.c > +++ b/hw/vfio/common.c > @@ -1087,6 +1087,14 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, > goto free_container_exit; > } > > + vfio_kvm_device_add_group(group); > + > + QLIST_INIT(&container->group_list); > + QLIST_INSERT_HEAD(&space->containers, container, next); > + > + group->container = container; > + QLIST_INSERT_HEAD(&container->group_list, group, container_next); > + > container->listener = vfio_memory_listener; > > memory_listener_register(&container->listener, container->space->as); > @@ -1100,14 +1108,11 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, > > container->initialized = true; > > - QLIST_INIT(&container->group_list); > - QLIST_INSERT_HEAD(&space->containers, container, next); > - > - group->container = container; > - QLIST_INSERT_HEAD(&container->group_list, group, container_next); > - > return 0; > listener_release_exit: > + QLIST_REMOVE(group, container_next); > + QLIST_REMOVE(container, next); > + vfio_kvm_device_del_group(group); > vfio_listener_release(container); > > free_container_exit: > @@ -1212,8 +1217,6 @@ VFIOGroup *vfio_get_group(int groupid, AddressSpace *as, Error **errp) > > QLIST_INSERT_HEAD(&vfio_group_list, group, next); > > - vfio_kvm_device_add_group(group); > - > return group; > > close_fd_exit: -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
© 2016 - 2026 Red Hat, Inc.