Current memory operations like pinning may take a lot of time at the
destination. Currently they are done after the source of the migration is
stopped, and before the workload is resumed at the destination. This is a
period where neither traffic can flow, nor the VM workload can continue
(downtime).

We can do better as we know the memory layout of the guest RAM at the
destination from the moment the migration starts. Moving that operation
allows QEMU to communicate the maps to the kernel while the workload is
still running in the source, so Linux can start mapping them. Ideally, all
the IOMMU is configured, but if the vDPA parent driver uses an on-chip
IOMMU and .set_map, we're still saving all the pinning time.

This is a first required step to consolidate all the members in a common
struct. This is needed because the destination does not know which
vhost_vdpa struct will have the registered listener member, so it is easier
to place them in a shared struct rather than keeping them in the vhost_vdpa
struct.

v1 from RFC:
* Fix vhost_vdpa_net_cvq_start checking for always_svq instead of
  shadow_data. This could cause CVQ not being shadowed if
  vhost_vdpa_net_cvq_start was called in the middle of a migration.

Eugenio Pérez (13):
  vdpa: add VhostVDPAShared
  vdpa: move iova tree to the shared struct
  vdpa: move iova_range to vhost_vdpa_shared
  vdpa: move shadow_data to vhost_vdpa_shared
  vdpa: use vdpa shared for tracing
  vdpa: move file descriptor to vhost_vdpa_shared
  vdpa: move iotlb_batch_begin_sent to vhost_vdpa_shared
  vdpa: move backend_cap to vhost_vdpa_shared
  vdpa: remove msg type of vhost_vdpa
  vdpa: move iommu_list to vhost_vdpa_shared
  vdpa: use VhostVDPAShared in vdpa_dma_map and unmap
  vdpa: use dev_shared in vdpa_iommu
  vdpa: move memory listener to vhost_vdpa_shared

 include/hw/virtio/vhost-vdpa.h |  36 +++++---
 hw/virtio/vdpa-dev.c           |   7 +-
 hw/virtio/vhost-vdpa.c         | 160 +++++++++++++++++----------------
 net/vhost-vdpa.c               | 117 ++++++++++++------------
 hw/virtio/trace-events         |  14 +--
 5 files changed, 174 insertions(+), 160 deletions(-)

--
2.39.3
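[Editor's note] To make the consolidation easier to picture, here is a rough
sketch of the kind of shared struct the series describes, inferred only from
the patch subjects above; the field names, types and header paths are
assumptions, not the actual definition added by these patches:

    /*
     * Illustrative sketch only -- NOT the definition from this series.
     * One instance would be shared by all vhost_vdpa structs of the same
     * device, so the memory listener can be registered (and the maps
     * communicated to the kernel) before any vhost_dev is started on the
     * destination.
     */
    #include "qemu/osdep.h"
    #include "qemu/queue.h"
    #include "exec/memory.h"                        /* MemoryListener */
    #include "hw/virtio/vhost-iova-tree.h"          /* VhostIOVATree */
    #include "standard-headers/linux/vhost_types.h" /* vhost_vdpa_iova_range */

    typedef struct VhostVDPAShared {
        int device_fd;                    /* /dev/vhost-vdpa-N, one per device */
        MemoryListener listener;          /* registered once per device */
        struct vhost_vdpa_iova_range iova_range;
        VhostIOVATree *iova_tree;         /* IOVA -> HVA translations for SVQ */
        QLIST_HEAD(, vdpa_iommu) iommu_list;
        uint64_t backend_cap;
        bool iotlb_batch_begin_sent;
        bool shadow_data;                 /* SVQ is mirroring the descriptors */
    } VhostVDPAShared;

    struct vhost_vdpa {
        /* State common to all vhost_vdpa instances of the same parent device */
        VhostVDPAShared *shared;
        /* ... per-vhost_dev members (index, shadow_vqs, ...) stay here ... */
    };

With something along these lines, the shared listener could be registered as
soon as the destination learns the guest memory layout, instead of waiting for
a particular vhost_vdpa to be started.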
On Sat, Nov 25, 2023 at 1:14 AM Eugenio Pérez <eperezma@redhat.com> wrote:
>
> Current memory operations like pinning may take a lot of time at the
> destination. Currently they are done after the source of the migration is
> stopped, and before the workload is resumed at the destination. This is a
> period where neither traffic can flow, nor the VM workload can continue
> (downtime).
>
> We can do better as we know the memory layout of the guest RAM at the
> destination from the moment the migration starts. Moving that operation
> allows QEMU to communicate the maps to the kernel while the workload is
> still running in the source, so Linux can start mapping them. Ideally, all
> the IOMMU is configured, but if the vDPA parent driver uses an on-chip
> IOMMU and .set_map, we're still saving all the pinning time.
>
> This is a first required step to consolidate all the members in a common
> struct. This is needed because the destination does not know which
> vhost_vdpa struct will have the registered listener member, so it is easier
> to place them in a shared struct rather than keeping them in the vhost_vdpa
> struct.
>
> v1 from RFC:
> * Fix vhost_vdpa_net_cvq_start checking for always_svq instead of
>   shadow_data. This could cause CVQ not being shadowed if
>   vhost_vdpa_net_cvq_start was called in the middle of a migration.

With the renaming of VhostVDPAShared to VhostVDPAParent:

Acked-by: Jason Wang <jasowang@redhat.com>

Thanks

> Eugenio Pérez (13):
>   vdpa: add VhostVDPAShared
>   vdpa: move iova tree to the shared struct
>   vdpa: move iova_range to vhost_vdpa_shared
>   vdpa: move shadow_data to vhost_vdpa_shared
>   vdpa: use vdpa shared for tracing
>   vdpa: move file descriptor to vhost_vdpa_shared
>   vdpa: move iotlb_batch_begin_sent to vhost_vdpa_shared
>   vdpa: move backend_cap to vhost_vdpa_shared
>   vdpa: remove msg type of vhost_vdpa
>   vdpa: move iommu_list to vhost_vdpa_shared
>   vdpa: use VhostVDPAShared in vdpa_dma_map and unmap
>   vdpa: use dev_shared in vdpa_iommu
>   vdpa: move memory listener to vhost_vdpa_shared
>
>  include/hw/virtio/vhost-vdpa.h |  36 +++++---
>  hw/virtio/vdpa-dev.c           |   7 +-
>  hw/virtio/vhost-vdpa.c         | 160 +++++++++++++++++----------------
>  net/vhost-vdpa.c               | 117 ++++++++++++------------
>  hw/virtio/trace-events         |  14 +--
>  5 files changed, 174 insertions(+), 160 deletions(-)
>
> --
> 2.39.3
>
>
Hi Eugenio,

QE performed regression testing after applying this patch series. The series
introduced a QEMU core dump bug; for the core dump information, please review
the attached file.

Tested-by: Lei Yang <leiyang@redhat.com>

On Sat, Nov 25, 2023 at 1:14 AM Eugenio Pérez <eperezma@redhat.com> wrote:
>
> Current memory operations like pinning may take a lot of time at the
> destination. Currently they are done after the source of the migration is
> stopped, and before the workload is resumed at the destination. This is a
> period where neither traffic can flow, nor the VM workload can continue
> (downtime).
>
> We can do better as we know the memory layout of the guest RAM at the
> destination from the moment the migration starts. Moving that operation
> allows QEMU to communicate the maps to the kernel while the workload is
> still running in the source, so Linux can start mapping them. Ideally, all
> the IOMMU is configured, but if the vDPA parent driver uses an on-chip
> IOMMU and .set_map, we're still saving all the pinning time.
>
> This is a first required step to consolidate all the members in a common
> struct. This is needed because the destination does not know which
> vhost_vdpa struct will have the registered listener member, so it is easier
> to place them in a shared struct rather than keeping them in the vhost_vdpa
> struct.
>
> v1 from RFC:
> * Fix vhost_vdpa_net_cvq_start checking for always_svq instead of
>   shadow_data. This could cause CVQ not being shadowed if
>   vhost_vdpa_net_cvq_start was called in the middle of a migration.
>
> Eugenio Pérez (13):
>   vdpa: add VhostVDPAShared
>   vdpa: move iova tree to the shared struct
>   vdpa: move iova_range to vhost_vdpa_shared
>   vdpa: move shadow_data to vhost_vdpa_shared
>   vdpa: use vdpa shared for tracing
>   vdpa: move file descriptor to vhost_vdpa_shared
>   vdpa: move iotlb_batch_begin_sent to vhost_vdpa_shared
>   vdpa: move backend_cap to vhost_vdpa_shared
>   vdpa: remove msg type of vhost_vdpa
>   vdpa: move iommu_list to vhost_vdpa_shared
>   vdpa: use VhostVDPAShared in vdpa_dma_map and unmap
>   vdpa: use dev_shared in vdpa_iommu
>   vdpa: move memory listener to vhost_vdpa_shared
>
>  include/hw/virtio/vhost-vdpa.h |  36 +++++---
>  hw/virtio/vdpa-dev.c           |   7 +-
>  hw/virtio/vhost-vdpa.c         | 160 +++++++++++++++++----------------
>  net/vhost-vdpa.c               | 117 ++++++++++++------------
>  hw/virtio/trace-events         |  14 +--
>  5 files changed, 174 insertions(+), 160 deletions(-)
>
> --
> 2.39.3
>
>

(gdb) bt full
#0  memory_listener_unregister (listener=0x55a1ed7e2d98) at ../system/memory.c:3123
#1  0x000055a1e8ae5eba in vhost_vdpa_cleanup (dev=0x55a1ebe60470) at ../hw/virtio/vhost-vdpa.c:751
        v = 0x55a1ecfd7f90
        __PRETTY_FUNCTION__ = "vhost_vdpa_cleanup"
#2  0x000055a1e8addf7b in vhost_dev_cleanup (hdev=0x55a1ebe60470) at ../hw/virtio/vhost.c:1603
        i = 1
        __PRETTY_FUNCTION__ = "vhost_dev_cleanup"
#3  0x000055a1e89b7d26 in vhost_net_cleanup (net=0x55a1ebe60470) at ../hw/net/vhost_net.c:469
#4  0x000055a1e8b97976 in vhost_vdpa_cleanup (nc=0x55a1ecfd7e10) at ../net/vhost-vdpa.c:235
        s = 0x55a1ecfd7e10
#5  0x000055a1e8b82aab in qemu_cleanup_net_client (nc=0x55a1ecfd7e10) at ../net/net.c:383
#6  0x000055a1e8b82d7b in qemu_del_net_client (nc=0x55a1ed52e260) at ../net/net.c:446
        ncs = {0x55a1ed52e260, 0x55a1ecfd7e10, 0x200000002, 0x55a1ed5e8b18, 0x55a1ed5e8b18,
          0x55a1ed5e8c90, 0x0 <repeats 45 times>, 0x7f32e874f45f <__libc_connect+95>,
          0x7ffdf7d70130, 0x6effffffff, 0x0, 0x7f32e879274e <open_socket+606>, 0x0, 0x0, 0x0,
          0x7f32e8792530 <open_socket+64>, 0x0, 0xd, 0x0, 0x0, 0x0, 0x0, 0x722f7261762f0001,
0x2f6463736e2f6e75, 0x74656b636f73, 0x0 <repeats 12 times>, 0x571484d67a787b00, 0x0, 0xfffffffffffffb08, 0xd, 0x5, 0x7f32e87bbda8, 0xffffffffffffffff, 0x7ffdf7d703a0, 0x7f32e8792b1d <__nscd_get_mapping+189>, 0x0, 0x7f32e8792a9c <__nscd_get_mapping+60>, 0x0, 0x0, 0x7ffdf7d701f0, 0x7f32e880dd68 <__hst_map_handle+8>, 0x200000000, 0x6, 0x0 <repeats 11 times>, 0x7f32e86ae3b0 <malloc_consolidate+368>, 0x0, 0x7f32e87ffce0 <main_arena+96>, 0x0, 0x0, 0x0, 0x0, 0x0, 0x620, 0x618, 0x62, 0x618, 0x6e, 0x7c, 0x7f32e86b0a81 <_int_malloc+3281>, 0x7f32e87ffc80 <main_arena>, 0x60, 0x0, 0x7, 0x618, 0x7107d30e5b873e01, 0x48, 0x70, 0x640, 0x18, 0x4800000062, 0x0, 0x0, 0x571484d67a787b00, 0x770000007c, 0x0, 0x0, 0x7ffdf7d714a0, 0xf, 0x55a1ec3750a0, 0x7ffdf7d71530, 0x7f32e87909c6 <__nscd_get_nl_timestamp+214>, 0x16ca, 0x571484d67a787b00, 0x7ffdf7d71530, 0x571484d67a787b00, 0x14, 0x7ffdf7d71530, 0x0, 0x7f32e876a3a3 <__check_pf+1523>, 0x2000300000014, 0x16ca6567f272, 0x100000000, 0x10014, 0x0, 0x6001401000000, 0xffffffffffffffff, 0x29100000291, 0x8000080008, 0x2001400000048, 0x16ca6567f272, 0x60000400a, 0x5200202600010014, 0x10bfe58788480000, 0x6001435492812, 0x278cdb00093a5b, 0x3b6d700000a89, 0x20000080008, 0x2001400000048, 0x16ca6567f272, 0x6fd80400a, 0x80fe00010014, 0x1eafba9a00000000, 0x600142fddac91, 0xffffffffffffffff, 0x9ff000009ff, 0x28000080008, 0x2001400000048, 0x16ca6567f272, 0x16fd80400a, 0x80fe00010014, 0xfff6ceba00000000, 0x60014f0110afe, 0xffffffffffffffff, 0xda900000da90, 0x8000080008, 0x0 <repeats 83 times>, 0x7f32e86b0a81 <_int_malloc+3281>, 0x7f32e87ffc80 <main_arena>, 0x13f, 0x0, 0x7, 0x1401, 0x7107d30e5b873e01, 0x65, 0x7f32e86af3be <_int_free+2030>, 0x1430, 0x50, 0x8000000141, 0x100000000, 0x1, 0xa, 0x770000007c, 0x571484d67a787b00, 0x0, 0x17b0, 0xa0, 0x0, 0x81a00000, 0x0, 0xa0, 0x0, 0x81a00000, 0x0, 0x81a00100, 0x7ffdf7d70ab0, 0x81a00000, 0x0, 0x81a000a0, 0x0, 0x7ffdf7d70920, 0x55a1e8d3cd42 <addrrange_intersection+226>, 0x7ffdf7d708e0, 0x7ffdf7d70ab0, 0x81a00000...} queues = 2 i = 1 nf = 0x0 next = 0x7ffdf7d71f88 __PRETTY_FUNCTION__ = "qemu_del_net_client" #7 0x000055a1e8b84a9f in qmp_netdev_del (id=0x55a1ed18c770 "idqT0hia", errp=0x7ffdf7d71f98) at ../net/net.c:1325 nc = 0x55a1ed52e260 opts = 0x55a1ec2dc020 __func__ = "qmp_netdev_del" #8 0x000055a1e8f5d7b8 in qmp_marshal_netdev_del (args=0x7f32dc0058f0, ret=0x7f32e7c34d98, errp=0x7f32e7c34d90) at qapi/qapi-commands-net.c:135 err = 0x0 ok = true v = 0x55a1ec2dc020 arg = {id = 0x55a1ed18c770 "idqT0hia"} #9 0x000055a1e8f9b241 in do_qmp_dispatch_bh (opaque=0x7f32e7c34e30) at ../qapi/qmp-dispatch.c:128 data = 0x7f32e7c34e30 __PRETTY_FUNCTION__ = "do_qmp_dispatch_bh" #10 0x000055a1e8fc58e9 in aio_bh_call (bh=0x7f32dc004150) at ../util/async.c:169 last_engaged_in_io = false reentrancy_guard = 0x0 #11 0x000055a1e8fc5a04 in aio_bh_poll (ctx=0x55a1ebed6fb0) at ../util/async.c:216 bh = 0x7f32dc004150 flags = 11 slice = {bh_list = {slh_first = 0x0}, next = {sqe_next = 0x0}} s = 0x7ffdf7d72090 ret = 1 #12 0x000055a1e8fa85af in aio_dispatch (ctx=0x55a1ebed6fb0) at ../util/aio-posix.c:423 #13 0x000055a1e8fc5e43 in aio_ctx_dispatch (source=0x55a1ebed6fb0, callback=0x0, user_data=0x0) at ../util/async.c:358 ctx = 0x55a1ebed6fb0 __PRETTY_FUNCTION__ = "aio_ctx_dispatch" #14 0x00007f32e8a7ef4f in g_main_dispatch (context=0x55a1ebed9520) at ../glib/gmain.c:3364 dispatch = 0x55a1e8fc5dec <aio_ctx_dispatch> prev_source = 0x0 begin_time_nsec = 2687834462600 was_in_call = <optimized out> user_data = 0x0 callback = 0x0 cb_funcs = 0x0 cb_data = 0x0 need_destroy = 
<optimized out>
        source = 0x55a1ebed6fb0
        current = 0x55a1ec33ffa0
        i = 0
#15 g_main_context_dispatch (context=0x55a1ebed9520) at ../glib/gmain.c:4079
#16 0x000055a1e8fc73ae in glib_pollfds_poll () at ../util/main-loop.c:290
        context = 0x55a1ebed9520
        pfds = 0x55a1ec3ece00
#17 0x000055a1e8fc742b in os_host_main_loop_wait (timeout=0) at ../util/main-loop.c:313
        context = 0x55a1ebed9520
        ret = 1
#18 0x000055a1e8fc7539 in main_loop_wait (nonblocking=0) at ../util/main-loop.c:592
        mlpoll = {state = 0, timeout = 4294967295, pollfds = 0x55a1ebebe860}
        ret = 32765
        timeout_ns = 798542174
#19 0x000055a1e8b22d27 in qemu_main_loop () at ../system/runstate.c:782
        status = 0
#20 0x000055a1e881b706 in qemu_default_main () at ../system/main.c:37
        status = 0
#21 0x000055a1e881b741 in main (argc=84, argv=0x7ffdf7d723c8) at ../system/main.c:48
On Thu, Nov 30, 2023 at 4:22 AM Lei Yang <leiyang@redhat.com> wrote:
>
> Hi Eugenio,
>
> QE performed regression testing after applying this patch series. The series
> introduced a QEMU core dump bug; for the core dump information, please review
> the attached file.
>

Hi Lei, thank you very much for the testing!

Can you describe the test steps that led to the crash? I think you
removed the vdpa device via QMP, but I'd like to be sure.

Thanks!

> Tested-by: Lei Yang <leiyang@redhat.com>
>
>
> On Sat, Nov 25, 2023 at 1:14 AM Eugenio Pérez <eperezma@redhat.com> wrote:
> >
> > Current memory operations like pinning may take a lot of time at the
> > destination. Currently they are done after the source of the migration is
> > stopped, and before the workload is resumed at the destination. This is a
> > period where neither traffic can flow, nor the VM workload can continue
> > (downtime).
> >
> > We can do better as we know the memory layout of the guest RAM at the
> > destination from the moment the migration starts. Moving that operation
> > allows QEMU to communicate the maps to the kernel while the workload is
> > still running in the source, so Linux can start mapping them. Ideally, all
> > the IOMMU is configured, but if the vDPA parent driver uses an on-chip
> > IOMMU and .set_map, we're still saving all the pinning time.
> >
> > This is a first required step to consolidate all the members in a common
> > struct. This is needed because the destination does not know which
> > vhost_vdpa struct will have the registered listener member, so it is easier
> > to place them in a shared struct rather than keeping them in the vhost_vdpa
> > struct.
> >
> > v1 from RFC:
> > * Fix vhost_vdpa_net_cvq_start checking for always_svq instead of
> >   shadow_data. This could cause CVQ not being shadowed if
> >   vhost_vdpa_net_cvq_start was called in the middle of a migration.
> >
> > Eugenio Pérez (13):
> >   vdpa: add VhostVDPAShared
> >   vdpa: move iova tree to the shared struct
> >   vdpa: move iova_range to vhost_vdpa_shared
> >   vdpa: move shadow_data to vhost_vdpa_shared
> >   vdpa: use vdpa shared for tracing
> >   vdpa: move file descriptor to vhost_vdpa_shared
> >   vdpa: move iotlb_batch_begin_sent to vhost_vdpa_shared
> >   vdpa: move backend_cap to vhost_vdpa_shared
> >   vdpa: remove msg type of vhost_vdpa
> >   vdpa: move iommu_list to vhost_vdpa_shared
> >   vdpa: use VhostVDPAShared in vdpa_dma_map and unmap
> >   vdpa: use dev_shared in vdpa_iommu
> >   vdpa: move memory listener to vhost_vdpa_shared
> >
> >  include/hw/virtio/vhost-vdpa.h |  36 +++++---
> >  hw/virtio/vdpa-dev.c           |   7 +-
> >  hw/virtio/vhost-vdpa.c         | 160 +++++++++++++++++----------------
> >  net/vhost-vdpa.c               | 117 ++++++++++++------------
> >  hw/virtio/trace-events         |  14 +--
> >  5 files changed, 174 insertions(+), 160 deletions(-)
> >
> > --
> > 2.39.3
> >
> >
On Thu, Nov 30, 2023 at 3:38 PM Eugenio Perez Martin <eperezma@redhat.com> wrote:
>
> On Thu, Nov 30, 2023 at 4:22 AM Lei Yang <leiyang@redhat.com> wrote:
> >
> > Hi Eugenio,
> >
> > QE performed regression testing after applying this patch series. The series
> > introduced a QEMU core dump bug; for the core dump information, please review
> > the attached file.
> >
>
> Hi Lei, thank you very much for the testing!
>

Hi Eugenio,

> Can you describe the test steps that led to the crash? I think you
> removed the vdpa device via QMP, but I'd like to be sure.

Yes, you're right, the core dump occurs when hot-unplugging the NIC. Please
review the following simple test steps (a rough QMP transcript of them is
sketched after this message):

Test steps:
1. Create two vDPA devices (vdpa0 and vdpa1) with multiple queues
2. Boot a guest with vdpa0
3. set_link false on vdpa0
4. Hot-plug vdpa1
5. Stop and resume the guest via QMP
6. Hot-unplug vdpa1; the core dump is hit at this point

Thanks
Lei

> Thanks!
>
> > Tested-by: Lei Yang <leiyang@redhat.com>
> >
> >
> > On Sat, Nov 25, 2023 at 1:14 AM Eugenio Pérez <eperezma@redhat.com> wrote:
> > >
> > > Current memory operations like pinning may take a lot of time at the
> > > destination. Currently they are done after the source of the migration is
> > > stopped, and before the workload is resumed at the destination. This is a
> > > period where neither traffic can flow, nor the VM workload can continue
> > > (downtime).
> > >
> > > We can do better as we know the memory layout of the guest RAM at the
> > > destination from the moment the migration starts. Moving that operation
> > > allows QEMU to communicate the maps to the kernel while the workload is
> > > still running in the source, so Linux can start mapping them. Ideally, all
> > > the IOMMU is configured, but if the vDPA parent driver uses an on-chip
> > > IOMMU and .set_map, we're still saving all the pinning time.
> > >
> > > This is a first required step to consolidate all the members in a common
> > > struct. This is needed because the destination does not know which
> > > vhost_vdpa struct will have the registered listener member, so it is easier
> > > to place them in a shared struct rather than keeping them in the vhost_vdpa
> > > struct.
> > >
> > > v1 from RFC:
> > > * Fix vhost_vdpa_net_cvq_start checking for always_svq instead of
> > >   shadow_data. This could cause CVQ not being shadowed if
> > >   vhost_vdpa_net_cvq_start was called in the middle of a migration.
> > >
> > > Eugenio Pérez (13):
> > >   vdpa: add VhostVDPAShared
> > >   vdpa: move iova tree to the shared struct
> > >   vdpa: move iova_range to vhost_vdpa_shared
> > >   vdpa: move shadow_data to vhost_vdpa_shared
> > >   vdpa: use vdpa shared for tracing
> > >   vdpa: move file descriptor to vhost_vdpa_shared
> > >   vdpa: move iotlb_batch_begin_sent to vhost_vdpa_shared
> > >   vdpa: move backend_cap to vhost_vdpa_shared
> > >   vdpa: remove msg type of vhost_vdpa
> > >   vdpa: move iommu_list to vhost_vdpa_shared
> > >   vdpa: use VhostVDPAShared in vdpa_dma_map and unmap
> > >   vdpa: use dev_shared in vdpa_iommu
> > >   vdpa: move memory listener to vhost_vdpa_shared
> > >
> > >  include/hw/virtio/vhost-vdpa.h |  36 +++++---
> > >  hw/virtio/vdpa-dev.c           |   7 +-
> > >  hw/virtio/vhost-vdpa.c         | 160 +++++++++++++++++----------------
> > >  net/vhost-vdpa.c               | 117 ++++++++++++------------
> > >  hw/virtio/trace-events         |  14 +--
> > >  5 files changed, 174 insertions(+), 160 deletions(-)
> > >
> > > --
> > > 2.39.3
> > >
> > >
>
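[Editor's note] For orientation, steps 3-6 above roughly correspond to a QMP
exchange like the sketch below. The ids, vhostdev paths and device properties
(hostnet0, hostnet1, /dev/vhost-vdpa-1, net1, mq, ...) are made-up examples,
not the exact values QE used:

    # 3. link down on the first vDPA NIC (vdpa0)
    {"execute": "set_link", "arguments": {"name": "hostnet0", "up": false}}

    # 4. hot-plug the second vDPA NIC (vdpa1)
    {"execute": "netdev_add", "arguments": {"type": "vhost-vdpa", "id": "hostnet1",
                                            "vhostdev": "/dev/vhost-vdpa-1", "queues": 2}}
    {"execute": "device_add", "arguments": {"driver": "virtio-net-pci", "id": "net1",
                                            "netdev": "hostnet1", "mq": true}}

    # 5. stop and resume the guest
    {"execute": "stop"}
    {"execute": "cont"}

    # 6. hot-unplug vdpa1; the crash is observed here
    {"execute": "device_del", "arguments": {"id": "net1"}}
    {"execute": "netdev_del", "arguments": {"id": "hostnet1"}}

The last command matches the entry point seen in the backtrace above
(qmp_netdev_del calling qemu_del_net_client in frames #7 and #6).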