[v1] vhost: fix the IO error after live migration

[PATCH 0/3] vhost: fix the IO error after live migration

Posted by Haoqian He 3 weeks, 3 days ago

At the end of the VM live migration, the vhost device will be stopped.
Currently, if the vhost-user backend crash, vhost device's set_status()
would not return failure, live migration won't perceive the disconnection
with the backend. After the live migration is successful, the stale inflight
IO would be submitted to the migration target host, which may leading to
the IO error.

The following patch series fixes the issue by making the live migration
aware of the lost of connection with the vhost-user backend and aborting
the live migration.

Haoqian He (3):
  virtio: add VM state change cb with return value
  vhost: return failure if stop virtqueue failed in vhost_dev_stop
  vhost-user: return failure if backend crash when live migration

 hw/block/vhost-user-blk.c             | 29 +++++++++++++++------------
 hw/block/virtio-blk.c                 |  2 +-
 hw/core/vm-change-state-handler.c     | 14 +++++++------
 hw/scsi/scsi-bus.c                    |  2 +-
 hw/scsi/vhost-scsi-common.c           | 11 +++++-----
 hw/scsi/vhost-user-scsi.c             | 20 ++++++++++--------
 hw/vfio/migration.c                   |  2 +-
 hw/virtio/vhost.c                     | 27 ++++++++++++++-----------
 hw/virtio/virtio.c                    | 25 ++++++++++++++++-------
 include/hw/virtio/vhost-scsi-common.h |  2 +-
 include/hw/virtio/vhost.h             |  8 +++++---
 include/hw/virtio/virtio.h            |  1 +
 include/system/runstate.h             | 11 +++++++---
 system/cpus.c                         |  4 ++--
 system/runstate.c                     | 25 ++++++++++++++++++-----
 15 files changed, 115 insertions(+), 68 deletions(-)

-- 
2.48.1

[PATCH v2 0/3] vhost: fix the IO error after live migration

Posted by Haoqian He 2 weeks, 5 days ago

At the end of the VM live migration, the vhost device will be stopped.
Currently, if the vhost-user backend crash, vhost device's set_status()
would not return failure, live migration won't perceive the disconnection
with the backend. After the live migration is successful, the stale inflight
IO will be submitted to the migration target host, which may be leading to
the IO error.

The following patch series fixes the issue by making the live migration
aware of the loss of connection with the vhost-user backend and aborting
the live migration.

---
  v1 ... v2
    1. Fix some grammar issues in commit message.
    2. Remove assert in vhost_scsi_common_stop and return error upwards.

Haoqian He (3):
  virtio: add VM state change cb with return value
  vhost: return failure if stop virtqueue failed in vhost_dev_stop
  vhost-user: return failure if backend crash when live migration

 hw/block/vhost-user-blk.c             | 29 +++++++++++++++------------
 hw/block/virtio-blk.c                 |  2 +-
 hw/core/vm-change-state-handler.c     | 14 +++++++------
 hw/scsi/scsi-bus.c                    |  2 +-
 hw/scsi/vhost-scsi-common.c           | 13 ++++++------
 hw/scsi/vhost-user-scsi.c             | 20 ++++++++++--------
 hw/vfio/migration.c                   |  2 +-
 hw/virtio/vhost.c                     | 27 ++++++++++++++-----------
 hw/virtio/virtio.c                    | 25 ++++++++++++++++-------
 include/hw/virtio/vhost-scsi-common.h |  2 +-
 include/hw/virtio/vhost.h             |  8 +++++---
 include/hw/virtio/virtio.h            |  1 +
 include/system/runstate.h             | 11 +++++++---
 system/cpus.c                         |  4 ++--
 system/runstate.c                     | 25 ++++++++++++++++++-----
 15 files changed, 116 insertions(+), 69 deletions(-)

-- 
2.48.1

[PATCH v3 0/3] vhost: fix the IO error after live migration

Posted by Haoqian He 6 days, 6 hours ago

At the end of the VM live migration, the vhost device will be stopped.
Currently, if the vhost-user backend crashes, vhost device's set_status()
would not return failure, live migration won't perceive the disconnection
with the backend. After the live migration is successful, the stale inflight
IO will be submitted to the migration target host, which may be leading to
the IO error.

The following patch series fixes the issue by making the live migration
aware of the loss of connection with the vhost-user backend and aborting
the live migration.

---
  v1 ... v2
    1. Fix some grammar issues in commit message.
    2. Remove assert in vhost_scsi_common_stop and return error upwards.

  v2 ... v3
    1. Added more detailed comments and commit message.
    2. Change the newly added type name and parameter name.
    3. Remove set_status_ext, change the return type of set_status to int.

Haoqian He (3):
  system/runstate: add VM state change cb with return value
  vhost: return failure if stop virtqueue failed in vhost_dev_stop
  vhost-user: return failure if backend crash when live migration

 backends/vhost-user.c                  | 20 +++++++--------
 hw/block/vhost-user-blk.c              | 27 +++++++++++---------
 hw/block/virtio-blk.c                  |  7 +++---
 hw/char/virtio-serial-bus.c            |  3 ++-
 hw/core/vm-change-state-handler.c      | 18 ++++++++-----
 hw/display/vhost-user-gpu.c            | 12 ++++++---
 hw/input/virtio-input.c                |  3 ++-
 hw/net/virtio-net.c                    |  3 ++-
 hw/scsi/scsi-bus.c                     |  2 +-
 hw/scsi/vhost-scsi-common.c            | 13 +++++-----
 hw/scsi/vhost-scsi.c                   |  5 ++--
 hw/scsi/vhost-user-scsi.c              | 18 +++++++------
 hw/vfio/migration.c                    |  2 +-
 hw/virtio/vdpa-dev.c                   |  5 ++--
 hw/virtio/vhost-user-base.c            | 23 ++++++++++-------
 hw/virtio/vhost-user-fs.c              | 23 ++++++++++-------
 hw/virtio/vhost-user-scmi.c            | 27 ++++++++++++--------
 hw/virtio/vhost-user-vsock.c           | 15 +++++++----
 hw/virtio/vhost-vsock-common.c         | 12 ++++-----
 hw/virtio/vhost-vsock.c                | 11 ++++----
 hw/virtio/vhost.c                      | 23 +++++++++--------
 hw/virtio/virtio-balloon.c             |  3 ++-
 hw/virtio/virtio-crypto.c              |  3 ++-
 hw/virtio/virtio-iommu.c               |  3 ++-
 hw/virtio/virtio-rng.c                 |  5 ++--
 hw/virtio/virtio.c                     | 23 +++++++++++------
 include/hw/virtio/vhost-scsi-common.h  |  2 +-
 include/hw/virtio/vhost-vsock-common.h |  2 +-
 include/hw/virtio/vhost.h              |  8 +++---
 include/hw/virtio/virtio.h             |  2 +-
 include/system/runstate.h              | 13 +++++++---
 include/system/vhost-user-backend.h    |  2 +-
 system/cpus.c                          |  8 ++++--
 system/runstate.c                      | 35 ++++++++++++++++++++++----
 34 files changed, 239 insertions(+), 142 deletions(-)

-- 
2.44.0

[PATCH v4 0/3] vhost: fix the IO error after live migration

Posted by Haoqian He 22 hours ago

At the end of the VM live migration, the vhost device will be stopped.
Currently, if the vhost-user backend crashes, vhost device's set_status()
would not return failure, live migration won't perceive the disconnection
with the backend. After the live migration is successful, the stale inflight
IO will be submitted to the migration target host, which may be leading to
the IO error.

The following patch series fixes the issue by making the live migration
aware of the loss of connection with the vhost-user backend and aborting
the live migration.

---
  v1 ... v2
    1. Fix some grammar issues in commit message.
    2. Remove assert in vhost_scsi_common_stop and return error upwards.

  v2 ... v3
    1. Added more detailed comments and commit message.
    2. Change the newly added type name and parameter name.
    3. Remove set_status_ext, change the return type of set_status to int.

  v3 ... v4
    1. Call set_status() only if the function pointer is not NULL in the
       3rd patch.
    2. Add the more detailed commit messages for the 3rd patch.

Haoqian He (3):
  system/runstate: add VM state change cb with return value
  vhost: return failure if stop virtqueue failed in vhost_dev_stop
  vhost-user: return failure if backend crash when live migration

 backends/vhost-user.c                  | 20 +++++++--------
 hw/block/vhost-user-blk.c              | 27 +++++++++++---------
 hw/block/virtio-blk.c                  |  7 +++---
 hw/char/virtio-serial-bus.c            |  3 ++-
 hw/core/vm-change-state-handler.c      | 18 ++++++++-----
 hw/display/vhost-user-gpu.c            | 12 ++++++---
 hw/input/virtio-input.c                |  3 ++-
 hw/net/virtio-net.c                    |  3 ++-
 hw/scsi/scsi-bus.c                     |  2 +-
 hw/scsi/vhost-scsi-common.c            | 13 +++++-----
 hw/scsi/vhost-scsi.c                   |  5 ++--
 hw/scsi/vhost-user-scsi.c              | 18 +++++++------
 hw/vfio/migration.c                    |  2 +-
 hw/virtio/vdpa-dev.c                   |  5 ++--
 hw/virtio/vhost-user-base.c            | 23 ++++++++++-------
 hw/virtio/vhost-user-fs.c              | 23 ++++++++++-------
 hw/virtio/vhost-user-scmi.c            | 27 ++++++++++++--------
 hw/virtio/vhost-user-vsock.c           | 15 +++++++----
 hw/virtio/vhost-vsock-common.c         | 12 ++++-----
 hw/virtio/vhost-vsock.c                | 11 ++++----
 hw/virtio/vhost.c                      | 23 +++++++++--------
 hw/virtio/virtio-balloon.c             |  3 ++-
 hw/virtio/virtio-crypto.c              |  3 ++-
 hw/virtio/virtio-iommu.c               |  3 ++-
 hw/virtio/virtio-rng.c                 |  5 ++--
 hw/virtio/virtio.c                     | 23 +++++++++++------
 include/hw/virtio/vhost-scsi-common.h  |  2 +-
 include/hw/virtio/vhost-vsock-common.h |  2 +-
 include/hw/virtio/vhost.h              |  8 +++---
 include/hw/virtio/virtio.h             |  2 +-
 include/system/runstate.h              | 13 +++++++---
 include/system/vhost-user-backend.h    |  2 +-
 system/cpus.c                          |  8 ++++--
 system/runstate.c                      | 35 ++++++++++++++++++++++----
 34 files changed, 240 insertions(+), 141 deletions(-)

-- 
2.44.0

[PATCH v4 1/3] system/runstate: add VM state change cb with return value

Posted by Haoqian He 22 hours ago

This patch adds the new VM state change cb type `VMChangeStateHandlerWithRet`,
which has return value for `VMChangeStateEntry`.

Thus, we can register a new VM state change cb with return value for device.
Note that `VMChangeStateHandler` and `VMChangeStateHandlerWithRet` are mutually
exclusive and cannot be provided at the same time.

This patch is the pre patch for 'vhost-user: return failure if backend crashes
when live migration', which makes the live migration aware of the loss of
connection with the vhost-user backend and aborts the live migration.

Virtio device will use VMChangeStateHandlerWithRet.

Signed-off-by: Haoqian He <haoqian.he@smartx.com>
---
 hw/block/virtio-blk.c             |  2 +-
 hw/core/vm-change-state-handler.c | 18 ++++++++++------
 hw/scsi/scsi-bus.c                |  2 +-
 hw/vfio/migration.c               |  2 +-
 hw/virtio/virtio.c                |  2 +-
 include/system/runstate.h         | 13 +++++++++---
 system/cpus.c                     |  8 +++++--
 system/runstate.c                 | 35 ++++++++++++++++++++++++++-----
 8 files changed, 62 insertions(+), 20 deletions(-)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 5135b4d8f1..4a48a16790 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -1928,7 +1928,7 @@ static void virtio_blk_device_realize(DeviceState *dev, Error **errp)
      * called after ->start_ioeventfd() has already set blk's AioContext.
      */
     s->change =
-        qdev_add_vm_change_state_handler(dev, virtio_blk_dma_restart_cb, s);
+        qdev_add_vm_change_state_handler(dev, virtio_blk_dma_restart_cb, NULL, s);
 
     blk_ram_registrar_init(&s->blk_ram_registrar, s->blk);
     blk_set_dev_ops(s->blk, &virtio_block_ops, s);
diff --git a/hw/core/vm-change-state-handler.c b/hw/core/vm-change-state-handler.c
index 7064995578..99c642b558 100644
--- a/hw/core/vm-change-state-handler.c
+++ b/hw/core/vm-change-state-handler.c
@@ -40,6 +40,7 @@ static int qdev_get_dev_tree_depth(DeviceState *dev)
  * qdev_add_vm_change_state_handler:
  * @dev: the device that owns this handler
  * @cb: the callback function to be invoked
+ * @cb_ret: the callback function with return value to be invoked
  * @opaque: user data passed to the callback function
  *
  * This function works like qemu_add_vm_change_state_handler() except callbacks
@@ -50,25 +51,30 @@ static int qdev_get_dev_tree_depth(DeviceState *dev)
  * controller's callback is invoked before the children on its bus when the VM
  * starts running.  The order is reversed when the VM stops running.
  *
+ * Note that the parameter `cb` and `cb_ret` are mutually exclusive.
+ *
  * Returns: an entry to be freed with qemu_del_vm_change_state_handler()
  */
 VMChangeStateEntry *qdev_add_vm_change_state_handler(DeviceState *dev,
                                                      VMChangeStateHandler *cb,
+                                                     VMChangeStateHandlerWithRet *cb_ret,
                                                      void *opaque)
 {
-    return qdev_add_vm_change_state_handler_full(dev, cb, NULL, opaque);
+    assert(!cb || !cb_ret);
+    return qdev_add_vm_change_state_handler_full(dev, cb, NULL, cb_ret, opaque);
 }
 
 /*
  * Exactly like qdev_add_vm_change_state_handler() but passes a prepare_cb
- * argument too.
+ * and the cb_ret arguments too.
  */
 VMChangeStateEntry *qdev_add_vm_change_state_handler_full(
-    DeviceState *dev, VMChangeStateHandler *cb,
-    VMChangeStateHandler *prepare_cb, void *opaque)
+    DeviceState *dev, VMChangeStateHandler *cb, VMChangeStateHandler *prepare_cb,
+    VMChangeStateHandlerWithRet *cb_ret, void *opaque)
 {
     int depth = qdev_get_dev_tree_depth(dev);
 
-    return qemu_add_vm_change_state_handler_prio_full(cb, prepare_cb, opaque,
-                                                      depth);
+    assert(!cb || !cb_ret);
+    return qemu_add_vm_change_state_handler_prio_full(cb, prepare_cb, cb_ret,
+                                                      opaque, depth);
 }
diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
index 7d4546800f..ec098f5f0a 100644
--- a/hw/scsi/scsi-bus.c
+++ b/hw/scsi/scsi-bus.c
@@ -356,7 +356,7 @@ static void scsi_qdev_realize(DeviceState *qdev, Error **errp)
         return;
     }
     dev->vmsentry = qdev_add_vm_change_state_handler(DEVICE(dev),
-            scsi_dma_restart_cb, dev);
+            scsi_dma_restart_cb, NULL, dev);
 }
 
 static void scsi_qdev_unrealize(DeviceState *qdev)
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 416643ddd6..f531db83ea 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -1015,7 +1015,7 @@ static int vfio_migration_init(VFIODevice *vbasedev)
                      vfio_vmstate_change_prepare :
                      NULL;
     migration->vm_state = qdev_add_vm_change_state_handler_full(
-        vbasedev->dev, vfio_vmstate_change, prepare_cb, vbasedev);
+        vbasedev->dev, vfio_vmstate_change, prepare_cb, NULL, vbasedev);
     migration_add_notifier(&migration->migration_state,
                            vfio_migration_state_notifier);
 
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 85110bce37..202a052053 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -3489,7 +3489,7 @@ void virtio_init(VirtIODevice *vdev, uint16_t device_id, size_t config_size)
         vdev->config = NULL;
     }
     vdev->vmstate = qdev_add_vm_change_state_handler(DEVICE(vdev),
-            virtio_vmstate_change, vdev);
+            virtio_vmstate_change, NULL, vdev);
     vdev->device_endian = virtio_default_endian();
     vdev->use_guest_notifier_mask = true;
 }
diff --git a/include/system/runstate.h b/include/system/runstate.h
index bffc3719d4..fdd5c4a517 100644
--- a/include/system/runstate.h
+++ b/include/system/runstate.h
@@ -12,6 +12,7 @@ bool runstate_needs_reset(void);
 void runstate_replay_enable(void);
 
 typedef void VMChangeStateHandler(void *opaque, bool running, RunState state);
+typedef int VMChangeStateHandlerWithRet(void *opaque, bool running, RunState state);
 
 VMChangeStateEntry *qemu_add_vm_change_state_handler(VMChangeStateHandler *cb,
                                                      void *opaque);
@@ -20,21 +21,27 @@ VMChangeStateEntry *qemu_add_vm_change_state_handler_prio(
 VMChangeStateEntry *
 qemu_add_vm_change_state_handler_prio_full(VMChangeStateHandler *cb,
                                            VMChangeStateHandler *prepare_cb,
+                                           VMChangeStateHandlerWithRet *cb_ret,
                                            void *opaque, int priority);
 VMChangeStateEntry *qdev_add_vm_change_state_handler(DeviceState *dev,
                                                      VMChangeStateHandler *cb,
+                                                     VMChangeStateHandlerWithRet *cb_ret,
                                                      void *opaque);
 VMChangeStateEntry *qdev_add_vm_change_state_handler_full(
-    DeviceState *dev, VMChangeStateHandler *cb,
-    VMChangeStateHandler *prepare_cb, void *opaque);
+    DeviceState *dev, VMChangeStateHandler *cb, VMChangeStateHandler *prepare_cb,
+    VMChangeStateHandlerWithRet *cb_ret, void *opaque);
 void qemu_del_vm_change_state_handler(VMChangeStateEntry *e);
 /**
  * vm_state_notify: Notify the state of the VM
  *
  * @running: whether the VM is running or not.
  * @state: the #RunState of the VM.
+ *
+ * Return the result of the callback which has return value.
+ * If no callback has return value, still return 0 and the
+ * upper layer should not do additional processing.
  */
-void vm_state_notify(bool running, RunState state);
+int vm_state_notify(bool running, RunState state);
 
 static inline bool shutdown_caused_by_guest(ShutdownCause cause)
 {
diff --git a/system/cpus.c b/system/cpus.c
index 2cc5f887ab..d16b0dff98 100644
--- a/system/cpus.c
+++ b/system/cpus.c
@@ -299,14 +299,18 @@ static int do_vm_stop(RunState state, bool send_stop)
         if (oldstate == RUN_STATE_RUNNING) {
             pause_all_vcpus();
         }
-        vm_state_notify(0, state);
+        ret = vm_state_notify(0, state);
         if (send_stop) {
             qapi_event_send_stop();
         }
     }
 
     bdrv_drain_all();
-    ret = bdrv_flush_all();
+    /*
+     * Even if vm_state_notify() return failure,
+     * it would be better to flush as before.
+     */
+    ret |= bdrv_flush_all();
     trace_vm_stop_flush_all(ret);
 
     return ret;
diff --git a/system/runstate.c b/system/runstate.c
index 272801d307..de74d962bc 100644
--- a/system/runstate.c
+++ b/system/runstate.c
@@ -297,6 +297,7 @@ void qemu_system_vmstop_request(RunState state)
 struct VMChangeStateEntry {
     VMChangeStateHandler *cb;
     VMChangeStateHandler *prepare_cb;
+    VMChangeStateHandlerWithRet *cb_ret;
     void *opaque;
     QTAILQ_ENTRY(VMChangeStateEntry) entries;
     int priority;
@@ -320,14 +321,15 @@ static QTAILQ_HEAD(, VMChangeStateEntry) vm_change_state_head =
 VMChangeStateEntry *qemu_add_vm_change_state_handler_prio(
         VMChangeStateHandler *cb, void *opaque, int priority)
 {
-    return qemu_add_vm_change_state_handler_prio_full(cb, NULL, opaque,
-                                                      priority);
+    return qemu_add_vm_change_state_handler_prio_full(cb, NULL, NULL,
+                                                      opaque, priority);
 }
 
 /**
  * qemu_add_vm_change_state_handler_prio_full:
  * @cb: the main callback to invoke
  * @prepare_cb: a callback to invoke before the main callback
+ * @cb_ret: the main callback to invoke with return value
  * @opaque: user data passed to the callbacks
  * @priority: low priorities execute first when the vm runs and the reverse is
  *            true when the vm stops
@@ -344,6 +346,7 @@ VMChangeStateEntry *qemu_add_vm_change_state_handler_prio(
 VMChangeStateEntry *
 qemu_add_vm_change_state_handler_prio_full(VMChangeStateHandler *cb,
                                            VMChangeStateHandler *prepare_cb,
+                                           VMChangeStateHandlerWithRet *cb_ret,
                                            void *opaque, int priority)
 {
     VMChangeStateEntry *e;
@@ -352,6 +355,7 @@ qemu_add_vm_change_state_handler_prio_full(VMChangeStateHandler *cb,
     e = g_malloc0(sizeof(*e));
     e->cb = cb;
     e->prepare_cb = prepare_cb;
+    e->cb_ret = cb_ret;
     e->opaque = opaque;
     e->priority = priority;
 
@@ -379,9 +383,10 @@ void qemu_del_vm_change_state_handler(VMChangeStateEntry *e)
     g_free(e);
 }
 
-void vm_state_notify(bool running, RunState state)
+int vm_state_notify(bool running, RunState state)
 {
     VMChangeStateEntry *e, *next;
+    int ret = 0;
 
     trace_vm_state_notify(running, state, RunState_str(state));
 
@@ -393,7 +398,17 @@ void vm_state_notify(bool running, RunState state)
         }
 
         QTAILQ_FOREACH_SAFE(e, &vm_change_state_head, entries, next) {
-            e->cb(e->opaque, running, state);
+            if (e->cb) {
+                e->cb(e->opaque, running, state);
+            } else if (e->cb_ret) {
+                /*
+                 * Here ignore the return value of cb_ret because
+                 * we only care about the stopping the device during
+                 * the VM live migration to indicate whether the
+                 * connection between qemu and backend is normal.
+                 */
+                e->cb_ret(e->opaque, running, state);
+            }
         }
     } else {
         QTAILQ_FOREACH_REVERSE_SAFE(e, &vm_change_state_head, entries, next) {
@@ -403,9 +418,19 @@ void vm_state_notify(bool running, RunState state)
         }
 
         QTAILQ_FOREACH_REVERSE_SAFE(e, &vm_change_state_head, entries, next) {
-            e->cb(e->opaque, running, state);
+            if (e->cb) {
+                e->cb(e->opaque, running, state);
+            } else if (e->cb_ret) {
+                /*
+                 * We should execute all registered callbacks even if
+                 * one of them returns failure, otherwise, some cleanup
+                 * work of the device will be skipped.
+                 */
+                ret |= e->cb_ret(e->opaque, running, state);
+            }
         }
     }
+    return ret;
 }
 
 static ShutdownCause reset_requested;
-- 
2.44.0

[PATCH v4 2/3] vhost: return failure if stop virtqueue failed in vhost_dev_stop

Posted by Haoqian He 22 hours ago

This patch captures the error of vhost_virtqueue_stop() in vhost_dev_stop()
and returns the error upward.

Specifically, if QEMU is disconnected from the vhost backend, some actions
in vhost_dev_stop() will fail, such as sending vhost-user messages to the
backend (GET_VRING_BASE, SET_VRING_ENABLE) and vhost_reset_status.

Considering that both set_vring_enable and vhost_reset_status require setting
the specific virtio feature bit, we can capture vhost_virtqueue_stop()'s
error to indicate that QEMU has lost connection with the backend.

This patch is the pre patch for 'vhost-user: return failure if backend crashes
when live migration', which makes the live migration aware of the loss of
connection with the vhost-user backend and aborts the live migration.

Signed-off-by: Haoqian He <haoqian.he@smartx.com>
---
 hw/virtio/vhost.c         | 23 +++++++++++++----------
 include/hw/virtio/vhost.h |  8 +++++---
 2 files changed, 18 insertions(+), 13 deletions(-)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 6aa72fd434..bdd945c18d 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1368,10 +1368,10 @@ fail_alloc_desc:
     return r;
 }
 
-void vhost_virtqueue_stop(struct vhost_dev *dev,
-                          struct VirtIODevice *vdev,
-                          struct vhost_virtqueue *vq,
-                          unsigned idx)
+int vhost_virtqueue_stop(struct vhost_dev *dev,
+                         struct VirtIODevice *vdev,
+                         struct vhost_virtqueue *vq,
+                         unsigned idx)
 {
     int vhost_vq_index = dev->vhost_ops->vhost_get_vq_index(dev, idx);
     struct vhost_vring_state state = {
@@ -1381,7 +1381,7 @@ void vhost_virtqueue_stop(struct vhost_dev *dev,
 
     if (virtio_queue_get_desc_addr(vdev, idx) == 0) {
         /* Don't stop the virtqueue which might have not been started */
-        return;
+        return 0;
     }
 
     r = dev->vhost_ops->vhost_get_vring_base(dev, &state);
@@ -1412,6 +1412,7 @@ void vhost_virtqueue_stop(struct vhost_dev *dev,
                        0, virtio_queue_get_avail_size(vdev, idx));
     vhost_memory_unmap(dev, vq->desc, virtio_queue_get_desc_size(vdev, idx),
                        0, virtio_queue_get_desc_size(vdev, idx));
+    return r;
 }
 
 static int vhost_virtqueue_set_busyloop_timeout(struct vhost_dev *dev,
@@ -2136,9 +2137,10 @@ fail_features:
 }
 
 /* Host notifiers must be enabled at this point. */
-void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
+int vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
 {
     int i;
+    int rc = 0;
 
     /* should only be called after backend is connected */
     assert(hdev->vhost_ops);
@@ -2157,10 +2159,10 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
         vhost_dev_set_vring_enable(hdev, false);
     }
     for (i = 0; i < hdev->nvqs; ++i) {
-        vhost_virtqueue_stop(hdev,
-                             vdev,
-                             hdev->vqs + i,
-                             hdev->vq_index + i);
+        rc |= vhost_virtqueue_stop(hdev,
+                                   vdev,
+                                   hdev->vqs + i,
+                                   hdev->vq_index + i);
     }
     if (hdev->vhost_ops->vhost_reset_status) {
         hdev->vhost_ops->vhost_reset_status(hdev);
@@ -2177,6 +2179,7 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
     hdev->started = false;
     vdev->vhost_started = false;
     hdev->vdev = NULL;
+    return rc;
 }
 
 int vhost_net_set_backend(struct vhost_dev *hdev,
diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index a9469d50bc..fd96ec9c39 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -232,8 +232,10 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings);
  * Stop the vhost device. After the device is stopped the notifiers
  * can be disabled (@vhost_dev_disable_notifiers) and the device can
  * be torn down (@vhost_dev_cleanup).
+ *
+ * Return: 0 on success, != 0 on error when stopping dev.
  */
-void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings);
+int vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings);
 
 /**
  * DOC: vhost device configuration handling
@@ -333,8 +335,8 @@ int vhost_device_iotlb_miss(struct vhost_dev *dev, uint64_t iova, int write);
 
 int vhost_virtqueue_start(struct vhost_dev *dev, struct VirtIODevice *vdev,
                           struct vhost_virtqueue *vq, unsigned idx);
-void vhost_virtqueue_stop(struct vhost_dev *dev, struct VirtIODevice *vdev,
-                          struct vhost_virtqueue *vq, unsigned idx);
+int vhost_virtqueue_stop(struct vhost_dev *dev, struct VirtIODevice *vdev,
+                         struct vhost_virtqueue *vq, unsigned idx);
 
 void vhost_dev_reset_inflight(struct vhost_inflight *inflight);
 void vhost_dev_free_inflight(struct vhost_inflight *inflight);
-- 
2.44.0

[PATCH v4 3/3] vhost-user: return failure if backend crash when live migration

Posted by Haoqian He 22 hours ago

Live migration should be terminated if the vhost-user backend crashes
before the migration completes.

Specifically, since the vhost device will be stopped when VM is stopped
before the end of the live migration, in current implementation if the
backend crashes, vhost-user device set_status() won't return failure,
live migration won't perceive the disconnection between QEMU and the
backend.

When the VM is migrated to the destination, the inflight IO will be
resubmitted, and if the IO was completed out of order before, it will
cause IO error.

To fix this issue:
1. Add the return value to set_status() for VirtioDeviceClass.
  a. For the vhost-user device, return failure when the backend crashes.
  b. For other virtio devices, always return 0.
2. Return failure if vhost_dev_stop() failed for vhost-user device.

If QEMU loses connection with the vhost-user backend, virtio set_status()
can return failure to the upper layer, migration_completion() can handle
the error, terminate the live migration, and restore the VM, so that
inflight IO can be completed normally.

Signed-off-by: Haoqian He <haoqian.he@smartx.com>
---
 backends/vhost-user.c                  | 20 +++++++++----------
 hw/block/vhost-user-blk.c              | 27 ++++++++++++++------------
 hw/block/virtio-blk.c                  |  5 +++--
 hw/char/virtio-serial-bus.c            |  3 ++-
 hw/display/vhost-user-gpu.c            | 12 +++++++++---
 hw/input/virtio-input.c                |  3 ++-
 hw/net/virtio-net.c                    |  3 ++-
 hw/scsi/vhost-scsi-common.c            | 13 +++++++------
 hw/scsi/vhost-scsi.c                   |  5 +++--
 hw/scsi/vhost-user-scsi.c              | 18 ++++++++++-------
 hw/virtio/vdpa-dev.c                   |  5 +++--
 hw/virtio/vhost-user-base.c            | 23 +++++++++++++---------
 hw/virtio/vhost-user-fs.c              | 23 +++++++++++++---------
 hw/virtio/vhost-user-scmi.c            | 27 +++++++++++++++-----------
 hw/virtio/vhost-user-vsock.c           | 15 +++++++++-----
 hw/virtio/vhost-vsock-common.c         | 12 ++++++------
 hw/virtio/vhost-vsock.c                | 11 ++++++-----
 hw/virtio/virtio-balloon.c             |  3 ++-
 hw/virtio/virtio-crypto.c              |  3 ++-
 hw/virtio/virtio-iommu.c               |  3 ++-
 hw/virtio/virtio-rng.c                 |  5 +++--
 hw/virtio/virtio.c                     | 23 +++++++++++++++-------
 include/hw/virtio/vhost-scsi-common.h  |  2 +-
 include/hw/virtio/vhost-vsock-common.h |  2 +-
 include/hw/virtio/virtio.h             |  2 +-
 include/system/vhost-user-backend.h    |  2 +-
 26 files changed, 161 insertions(+), 109 deletions(-)

diff --git a/backends/vhost-user.c b/backends/vhost-user.c
index d0e4d71a63..5409e9d4eb 100644
--- a/backends/vhost-user.c
+++ b/backends/vhost-user.c
@@ -97,30 +97,28 @@ err_host_notifiers:
     vhost_dev_disable_notifiers(&b->dev, b->vdev);
 }
 
-void
+int
 vhost_user_backend_stop(VhostUserBackend *b)
 {
     BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(b->vdev)));
     VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
-    int ret = 0;
+    int ret;
 
     if (!b->started) {
-        return;
+        return 0;
     }
 
-    vhost_dev_stop(&b->dev, b->vdev, true);
+    ret = vhost_dev_stop(&b->dev, b->vdev, true);
 
-    if (k->set_guest_notifiers) {
-        ret = k->set_guest_notifiers(qbus->parent,
-                                     b->dev.nvqs, false);
-        if (ret < 0) {
-            error_report("vhost guest notifier cleanup failed: %d", ret);
-        }
+    if (k->set_guest_notifiers &&
+        k->set_guest_notifiers(qbus->parent, b->dev.nvqs, false)< 0) {
+        error_report("vhost guest notifier cleanup failed: %d", ret);
+        return -1;
     }
-    assert(ret >= 0);
 
     vhost_dev_disable_notifiers(&b->dev, b->vdev);
     b->started = false;
+    return ret;
 }
 
 static void set_chardev(Object *obj, const char *value, Error **errp)
diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
index ae42327cf8..f355459123 100644
--- a/hw/block/vhost-user-blk.c
+++ b/hw/block/vhost-user-blk.c
@@ -204,7 +204,7 @@ err_host_notifiers:
     return ret;
 }
 
-static void vhost_user_blk_stop(VirtIODevice *vdev)
+static int vhost_user_blk_stop(VirtIODevice *vdev)
 {
     VHostUserBlk *s = VHOST_USER_BLK(vdev);
     BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
@@ -212,26 +212,26 @@ static void vhost_user_blk_stop(VirtIODevice *vdev)
     int ret;
 
     if (!s->started_vu) {
-        return;
+        return 0;
     }
     s->started_vu = false;
 
     if (!k->set_guest_notifiers) {
-        return;
+        return 0;
     }
 
-    vhost_dev_stop(&s->dev, vdev, true);
+    ret = vhost_dev_stop(&s->dev, vdev, true);
 
-    ret = k->set_guest_notifiers(qbus->parent, s->dev.nvqs, false);
-    if (ret < 0) {
+    if (k->set_guest_notifiers(qbus->parent, s->dev.nvqs, false) < 0) {
         error_report("vhost guest notifier cleanup failed: %d", ret);
-        return;
+        return -1;
     }
 
     vhost_dev_disable_notifiers(&s->dev, vdev);
+    return ret;
 }
 
-static void vhost_user_blk_set_status(VirtIODevice *vdev, uint8_t status)
+static int vhost_user_blk_set_status(VirtIODevice *vdev, uint8_t status)
 {
     VHostUserBlk *s = VHOST_USER_BLK(vdev);
     bool should_start = virtio_device_should_start(vdev, status);
@@ -239,11 +239,11 @@ static void vhost_user_blk_set_status(VirtIODevice *vdev, uint8_t status)
     int ret;
 
     if (!s->connected) {
-        return;
+        return -1;
     }
 
     if (vhost_dev_is_started(&s->dev) == should_start) {
-        return;
+        return 0;
     }
 
     if (should_start) {
@@ -253,9 +253,12 @@ static void vhost_user_blk_set_status(VirtIODevice *vdev, uint8_t status)
             qemu_chr_fe_disconnect(&s->chardev);
         }
     } else {
-        vhost_user_blk_stop(vdev);
+        ret = vhost_user_blk_stop(vdev);
+        if (ret < 0) {
+            return ret;
+        }
     }
-
+    return 0;
 }
 
 static uint64_t vhost_user_blk_get_features(VirtIODevice *vdev,
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 4a48a16790..4244ee3410 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -1269,7 +1269,7 @@ static uint64_t virtio_blk_get_features(VirtIODevice *vdev, uint64_t features,
     return features;
 }
 
-static void virtio_blk_set_status(VirtIODevice *vdev, uint8_t status)
+static int virtio_blk_set_status(VirtIODevice *vdev, uint8_t status)
 {
     VirtIOBlock *s = VIRTIO_BLK(vdev);
 
@@ -1278,7 +1278,7 @@ static void virtio_blk_set_status(VirtIODevice *vdev, uint8_t status)
     }
 
     if (!(status & VIRTIO_CONFIG_S_DRIVER_OK)) {
-        return;
+        return 0;
     }
 
     /* A guest that supports VIRTIO_BLK_F_CONFIG_WCE must be able to send
@@ -1301,6 +1301,7 @@ static void virtio_blk_set_status(VirtIODevice *vdev, uint8_t status)
                                    virtio_vdev_has_feature(vdev,
                                                            VIRTIO_BLK_F_WCE));
     }
+    return 0;
 }
 
 static void virtio_blk_save_device(VirtIODevice *vdev, QEMUFile *f)
diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
index b6d2743a9c..22accfd8e4 100644
--- a/hw/char/virtio-serial-bus.c
+++ b/hw/char/virtio-serial-bus.c
@@ -622,7 +622,7 @@ static void guest_reset(VirtIOSerial *vser)
     }
 }
 
-static void set_status(VirtIODevice *vdev, uint8_t status)
+static int set_status(VirtIODevice *vdev, uint8_t status)
 {
     VirtIOSerial *vser;
     VirtIOSerialPort *port;
@@ -650,6 +650,7 @@ static void set_status(VirtIODevice *vdev, uint8_t status)
             vsc->enable_backend(port, vdev->vm_running);
         }
     }
+    return 0;
 }
 
 static void vser_reset(VirtIODevice *vdev)
diff --git a/hw/display/vhost-user-gpu.c b/hw/display/vhost-user-gpu.c
index 2aed6243f6..d7d159ecd9 100644
--- a/hw/display/vhost-user-gpu.c
+++ b/hw/display/vhost-user-gpu.c
@@ -513,7 +513,7 @@ vhost_user_gpu_set_config(VirtIODevice *vdev,
     }
 }
 
-static void
+static int
 vhost_user_gpu_set_status(VirtIODevice *vdev, uint8_t val)
 {
     VhostUserGPU *g = VHOST_USER_GPU(vdev);
@@ -522,18 +522,24 @@ vhost_user_gpu_set_status(VirtIODevice *vdev, uint8_t val)
     if (val & VIRTIO_CONFIG_S_DRIVER_OK && vdev->vm_running) {
         if (!vhost_user_gpu_do_set_socket(g, &err)) {
             error_report_err(err);
-            return;
+            return 0;
         }
         vhost_user_backend_start(g->vhost);
     } else {
+        int ret;
+
         /* unblock any wait and stop processing */
         if (g->vhost_gpu_fd != -1) {
             vhost_user_gpu_update_blocked(g, true);
             qemu_chr_fe_deinit(&g->vhost_chr, true);
             g->vhost_gpu_fd = -1;
         }
-        vhost_user_backend_stop(g->vhost);
+        ret = vhost_user_backend_stop(g->vhost);
+        if (ret < 0) {
+            return ret;
+        }
     }
+    return 0;
 }
 
 static bool
diff --git a/hw/input/virtio-input.c b/hw/input/virtio-input.c
index 1394d99c6b..7ce53b7449 100644
--- a/hw/input/virtio-input.c
+++ b/hw/input/virtio-input.c
@@ -189,7 +189,7 @@ static uint64_t virtio_input_get_features(VirtIODevice *vdev, uint64_t f,
     return f;
 }
 
-static void virtio_input_set_status(VirtIODevice *vdev, uint8_t val)
+static int virtio_input_set_status(VirtIODevice *vdev, uint8_t val)
 {
     VirtIOInputClass *vic = VIRTIO_INPUT_GET_CLASS(vdev);
     VirtIOInput *vinput = VIRTIO_INPUT(vdev);
@@ -202,6 +202,7 @@ static void virtio_input_set_status(VirtIODevice *vdev, uint8_t val)
             }
         }
     }
+    return 0;
 }
 
 static void virtio_input_reset(VirtIODevice *vdev)
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index de87cfadff..b7de824baa 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -382,7 +382,7 @@ static void virtio_net_drop_tx_queue_data(VirtIODevice *vdev, VirtQueue *vq)
     }
 }
 
-static void virtio_net_set_status(struct VirtIODevice *vdev, uint8_t status)
+static int virtio_net_set_status(struct VirtIODevice *vdev, uint8_t status)
 {
     VirtIONet *n = VIRTIO_NET(vdev);
     VirtIONetQueue *q;
@@ -437,6 +437,7 @@ static void virtio_net_set_status(struct VirtIODevice *vdev, uint8_t status)
             }
         }
     }
+    return 0;
 }
 
 static void virtio_net_set_link_status(NetClientState *nc)
diff --git a/hw/scsi/vhost-scsi-common.c b/hw/scsi/vhost-scsi-common.c
index 4c8637045d..43525ba46d 100644
--- a/hw/scsi/vhost-scsi-common.c
+++ b/hw/scsi/vhost-scsi-common.c
@@ -101,24 +101,25 @@ err_host_notifiers:
     return ret;
 }
 
-void vhost_scsi_common_stop(VHostSCSICommon *vsc)
+int vhost_scsi_common_stop(VHostSCSICommon *vsc)
 {
     VirtIODevice *vdev = VIRTIO_DEVICE(vsc);
     BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
     VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
     int ret = 0;
 
-    vhost_dev_stop(&vsc->dev, vdev, true);
+    ret = vhost_dev_stop(&vsc->dev, vdev, true);
 
     if (k->set_guest_notifiers) {
-        ret = k->set_guest_notifiers(qbus->parent, vsc->dev.nvqs, false);
-        if (ret < 0) {
-                error_report("vhost guest notifier cleanup failed: %d", ret);
+        int r = k->set_guest_notifiers(qbus->parent, vsc->dev.nvqs, false);
+        if (r < 0) {
+            error_report("vhost guest notifier cleanup failed: %d", ret);
+            return r;
         }
     }
-    assert(ret >= 0);
 
     vhost_dev_disable_notifiers(&vsc->dev, vdev);
+    return ret;
 }
 
 uint64_t vhost_scsi_common_get_features(VirtIODevice *vdev, uint64_t features,
diff --git a/hw/scsi/vhost-scsi.c b/hw/scsi/vhost-scsi.c
index 8039d13fd9..19d5261019 100644
--- a/hw/scsi/vhost-scsi.c
+++ b/hw/scsi/vhost-scsi.c
@@ -114,7 +114,7 @@ static void vhost_scsi_stop(VHostSCSI *s)
     vhost_scsi_common_stop(vsc);
 }
 
-static void vhost_scsi_set_status(VirtIODevice *vdev, uint8_t val)
+static int vhost_scsi_set_status(VirtIODevice *vdev, uint8_t val)
 {
     VHostSCSI *s = VHOST_SCSI(vdev);
     VHostSCSICommon *vsc = VHOST_SCSI_COMMON(s);
@@ -125,7 +125,7 @@ static void vhost_scsi_set_status(VirtIODevice *vdev, uint8_t val)
     }
 
     if (vhost_dev_is_started(&vsc->dev) == start) {
-        return;
+        return 0;
     }
 
     if (start) {
@@ -139,6 +139,7 @@ static void vhost_scsi_set_status(VirtIODevice *vdev, uint8_t val)
     } else {
         vhost_scsi_stop(s);
     }
+    return 0;
 }
 
 static void vhost_dummy_handle_output(VirtIODevice *vdev, VirtQueue *vq)
diff --git a/hw/scsi/vhost-user-scsi.c b/hw/scsi/vhost-user-scsi.c
index adb41b9816..aba9fb82d2 100644
--- a/hw/scsi/vhost-user-scsi.c
+++ b/hw/scsi/vhost-user-scsi.c
@@ -52,19 +52,19 @@ static int vhost_user_scsi_start(VHostUserSCSI *s, Error **errp)
     return ret;
 }
 
-static void vhost_user_scsi_stop(VHostUserSCSI *s)
+static int vhost_user_scsi_stop(VHostUserSCSI *s)
 {
     VHostSCSICommon *vsc = VHOST_SCSI_COMMON(s);
 
     if (!s->started_vu) {
-        return;
+        return 0;
     }
     s->started_vu = false;
 
-    vhost_scsi_common_stop(vsc);
+    return vhost_scsi_common_stop(vsc);
 }
 
-static void vhost_user_scsi_set_status(VirtIODevice *vdev, uint8_t status)
+static int vhost_user_scsi_set_status(VirtIODevice *vdev, uint8_t status)
 {
     VHostUserSCSI *s = (VHostUserSCSI *)vdev;
     DeviceState *dev = DEVICE(vdev);
@@ -75,11 +75,11 @@ static void vhost_user_scsi_set_status(VirtIODevice *vdev, uint8_t status)
     int ret;
 
     if (!s->connected) {
-        return;
+        return -1;
     }
 
     if (vhost_dev_is_started(&vsc->dev) == should_start) {
-        return;
+        return 0;
     }
 
     if (should_start) {
@@ -91,8 +91,12 @@ static void vhost_user_scsi_set_status(VirtIODevice *vdev, uint8_t status)
             qemu_chr_fe_disconnect(&vs->conf.chardev);
         }
     } else {
-        vhost_user_scsi_stop(s);
+        ret = vhost_user_scsi_stop(s);
+        if (ret) {
+            return ret;
+        }
     }
+    return 0;
 }
 
 static void vhost_user_scsi_handle_output(VirtIODevice *vdev, VirtQueue *vq)
diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
index a7e73b1c99..d80d46baed 100644
--- a/hw/virtio/vdpa-dev.c
+++ b/hw/virtio/vdpa-dev.c
@@ -312,7 +312,7 @@ static void vhost_vdpa_device_stop(VirtIODevice *vdev)
     vhost_dev_disable_notifiers(&s->dev, vdev);
 }
 
-static void vhost_vdpa_device_set_status(VirtIODevice *vdev, uint8_t status)
+static int vhost_vdpa_device_set_status(VirtIODevice *vdev, uint8_t status)
 {
     VhostVdpaDevice *s = VHOST_VDPA_DEVICE(vdev);
     bool should_start = virtio_device_started(vdev, status);
@@ -324,7 +324,7 @@ static void vhost_vdpa_device_set_status(VirtIODevice *vdev, uint8_t status)
     }
 
     if (s->started == should_start) {
-        return;
+        return 0;
     }
 
     if (should_start) {
@@ -335,6 +335,7 @@ static void vhost_vdpa_device_set_status(VirtIODevice *vdev, uint8_t status)
     } else {
         vhost_vdpa_device_stop(vdev);
     }
+    return 0;
 }
 
 static const Property vhost_vdpa_device_properties[] = {
diff --git a/hw/virtio/vhost-user-base.c b/hw/virtio/vhost-user-base.c
index 2bc3423326..37718653bd 100644
--- a/hw/virtio/vhost-user-base.c
+++ b/hw/virtio/vhost-user-base.c
@@ -66,7 +66,7 @@ err_host_notifiers:
     vhost_dev_disable_notifiers(&vub->vhost_dev, vdev);
 }
 
-static void vub_stop(VirtIODevice *vdev)
+static int vub_stop(VirtIODevice *vdev)
 {
     VHostUserBase *vub = VHOST_USER_BASE(vdev);
     BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
@@ -74,34 +74,39 @@ static void vub_stop(VirtIODevice *vdev)
     int ret;
 
     if (!k->set_guest_notifiers) {
-        return;
+        return 0;
     }
 
-    vhost_dev_stop(&vub->vhost_dev, vdev, true);
+    ret = vhost_dev_stop(&vub->vhost_dev, vdev, true);
 
-    ret = k->set_guest_notifiers(qbus->parent, vub->vhost_dev.nvqs, false);
-    if (ret < 0) {
+    if (k->set_guest_notifiers(qbus->parent, vub->vhost_dev.nvqs, false) < 0) {
         error_report("vhost guest notifier cleanup failed: %d", ret);
-        return;
+        return -1;
     }
 
     vhost_dev_disable_notifiers(&vub->vhost_dev, vdev);
+    return ret;
 }
 
-static void vub_set_status(VirtIODevice *vdev, uint8_t status)
+static int vub_set_status(VirtIODevice *vdev, uint8_t status)
 {
     VHostUserBase *vub = VHOST_USER_BASE(vdev);
     bool should_start = virtio_device_should_start(vdev, status);
 
     if (vhost_dev_is_started(&vub->vhost_dev) == should_start) {
-        return;
+        return 0;
     }
 
     if (should_start) {
         vub_start(vdev);
     } else {
-        vub_stop(vdev);
+        int ret;
+        ret = vub_stop(vdev);
+        if (ret < 0) {
+            return ret;
+        }
     }
+    return 0;
 }
 
 /*
diff --git a/hw/virtio/vhost-user-fs.c b/hw/virtio/vhost-user-fs.c
index 3f00d79ed0..266c179671 100644
--- a/hw/virtio/vhost-user-fs.c
+++ b/hw/virtio/vhost-user-fs.c
@@ -100,7 +100,7 @@ err_host_notifiers:
     vhost_dev_disable_notifiers(&fs->vhost_dev, vdev);
 }
 
-static void vuf_stop(VirtIODevice *vdev)
+static int vuf_stop(VirtIODevice *vdev)
 {
     VHostUserFS *fs = VHOST_USER_FS(vdev);
     BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
@@ -108,34 +108,39 @@ static void vuf_stop(VirtIODevice *vdev)
     int ret;
 
     if (!k->set_guest_notifiers) {
-        return;
+        return 0;
     }
 
-    vhost_dev_stop(&fs->vhost_dev, vdev, true);
+    ret = vhost_dev_stop(&fs->vhost_dev, vdev, true);
 
-    ret = k->set_guest_notifiers(qbus->parent, fs->vhost_dev.nvqs, false);
-    if (ret < 0) {
+    if (k->set_guest_notifiers(qbus->parent, fs->vhost_dev.nvqs, false) < 0) {
         error_report("vhost guest notifier cleanup failed: %d", ret);
-        return;
+        return -1;
     }
 
     vhost_dev_disable_notifiers(&fs->vhost_dev, vdev);
+    return ret;
 }
 
-static void vuf_set_status(VirtIODevice *vdev, uint8_t status)
+static int vuf_set_status(VirtIODevice *vdev, uint8_t status)
 {
     VHostUserFS *fs = VHOST_USER_FS(vdev);
     bool should_start = virtio_device_should_start(vdev, status);
 
     if (vhost_dev_is_started(&fs->vhost_dev) == should_start) {
-        return;
+        return 0;
     }
 
     if (should_start) {
         vuf_start(vdev);
     } else {
-        vuf_stop(vdev);
+        int ret;
+        ret = vuf_stop(vdev);
+        if (ret < 0) {
+            return ret;
+        }
     }
+    return 0;
 }
 
 static uint64_t vuf_get_features(VirtIODevice *vdev,
diff --git a/hw/virtio/vhost-user-scmi.c b/hw/virtio/vhost-user-scmi.c
index 410a936ca7..a9778f1f4b 100644
--- a/hw/virtio/vhost-user-scmi.c
+++ b/hw/virtio/vhost-user-scmi.c
@@ -83,7 +83,7 @@ err_host_notifiers:
     return ret;
 }
 
-static void vu_scmi_stop(VirtIODevice *vdev)
+static int vu_scmi_stop(VirtIODevice *vdev)
 {
     VHostUserSCMI *scmi = VHOST_USER_SCMI(vdev);
     BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
@@ -93,41 +93,46 @@ static void vu_scmi_stop(VirtIODevice *vdev)
 
     /* vhost_dev_is_started() check in the callers is not fully reliable. */
     if (!scmi->started_vu) {
-        return;
+        return 0;
     }
     scmi->started_vu = false;
 
     if (!k->set_guest_notifiers) {
-        return;
+        return 0;
     }
 
-    vhost_dev_stop(vhost_dev, vdev, true);
+    ret = vhost_dev_stop(vhost_dev, vdev, true);
 
-    ret = k->set_guest_notifiers(qbus->parent, vhost_dev->nvqs, false);
-    if (ret < 0) {
+    if (k->set_guest_notifiers(qbus->parent, vhost_dev->nvqs, false) < 0) {
         error_report("vhost guest notifier cleanup failed: %d", ret);
-        return;
+        return -1;
     }
     vhost_dev_disable_notifiers(vhost_dev, vdev);
+    return ret;
 }
 
-static void vu_scmi_set_status(VirtIODevice *vdev, uint8_t status)
+static int vu_scmi_set_status(VirtIODevice *vdev, uint8_t status)
 {
     VHostUserSCMI *scmi = VHOST_USER_SCMI(vdev);
     bool should_start = virtio_device_should_start(vdev, status);
 
     if (!scmi->connected) {
-        return;
+        return -1;
     }
     if (vhost_dev_is_started(&scmi->vhost_dev) == should_start) {
-        return;
+        return 0;
     }
 
     if (should_start) {
         vu_scmi_start(vdev);
     } else {
-        vu_scmi_stop(vdev);
+        int ret;
+        ret = vu_scmi_stop(vdev);
+        if (ret < 0) {
+            return ret;
+        }
     }
+    return 0;
 }
 
 static uint64_t vu_scmi_get_features(VirtIODevice *vdev, uint64_t features,
diff --git a/hw/virtio/vhost-user-vsock.c b/hw/virtio/vhost-user-vsock.c
index 293273080b..7a4aec7bf4 100644
--- a/hw/virtio/vhost-user-vsock.c
+++ b/hw/virtio/vhost-user-vsock.c
@@ -54,23 +54,28 @@ const VhostDevConfigOps vsock_ops = {
     .vhost_dev_config_notifier = vuv_handle_config_change,
 };
 
-static void vuv_set_status(VirtIODevice *vdev, uint8_t status)
+static int vuv_set_status(VirtIODevice *vdev, uint8_t status)
 {
     VHostVSockCommon *vvc = VHOST_VSOCK_COMMON(vdev);
     bool should_start = virtio_device_should_start(vdev, status);
+    int ret;
 
     if (vhost_dev_is_started(&vvc->vhost_dev) == should_start) {
-        return;
+        return 0;
     }
 
     if (should_start) {
-        int ret = vhost_vsock_common_start(vdev);
+        ret = vhost_vsock_common_start(vdev);
         if (ret < 0) {
-            return;
+            return ret;
         }
     } else {
-        vhost_vsock_common_stop(vdev);
+        ret = vhost_vsock_common_stop(vdev);
+        if (ret < 0) {
+            return ret;
+        }
     }
+    return 0;
 }
 
 static uint64_t vuv_get_features(VirtIODevice *vdev,
diff --git a/hw/virtio/vhost-vsock-common.c b/hw/virtio/vhost-vsock-common.c
index 9ac587d20c..bbfbbdb317 100644
--- a/hw/virtio/vhost-vsock-common.c
+++ b/hw/virtio/vhost-vsock-common.c
@@ -95,7 +95,7 @@ err_host_notifiers:
     return ret;
 }
 
-void vhost_vsock_common_stop(VirtIODevice *vdev)
+int vhost_vsock_common_stop(VirtIODevice *vdev)
 {
     VHostVSockCommon *vvc = VHOST_VSOCK_COMMON(vdev);
     BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
@@ -103,18 +103,18 @@ void vhost_vsock_common_stop(VirtIODevice *vdev)
     int ret;
 
     if (!k->set_guest_notifiers) {
-        return;
+        return 0;
     }
 
-    vhost_dev_stop(&vvc->vhost_dev, vdev, true);
+    ret = vhost_dev_stop(&vvc->vhost_dev, vdev, true);
 
-    ret = k->set_guest_notifiers(qbus->parent, vvc->vhost_dev.nvqs, false);
-    if (ret < 0) {
+    if (k->set_guest_notifiers(qbus->parent, vvc->vhost_dev.nvqs, false) < 0) {
         error_report("vhost guest notifier cleanup failed: %d", ret);
-        return;
+        return -1;
     }
 
     vhost_dev_disable_notifiers(&vvc->vhost_dev, vdev);
+    return ret;
 }
 
 
diff --git a/hw/virtio/vhost-vsock.c b/hw/virtio/vhost-vsock.c
index 940b30fa27..62b36b28f5 100644
--- a/hw/virtio/vhost-vsock.c
+++ b/hw/virtio/vhost-vsock.c
@@ -67,37 +67,38 @@ static int vhost_vsock_set_running(VirtIODevice *vdev, int start)
 }
 
 
-static void vhost_vsock_set_status(VirtIODevice *vdev, uint8_t status)
+static int vhost_vsock_set_status(VirtIODevice *vdev, uint8_t status)
 {
     VHostVSockCommon *vvc = VHOST_VSOCK_COMMON(vdev);
     bool should_start = virtio_device_should_start(vdev, status);
     int ret;
 
     if (vhost_dev_is_started(&vvc->vhost_dev) == should_start) {
-        return;
+        return 0;
     }
 
     if (should_start) {
         ret = vhost_vsock_common_start(vdev);
         if (ret < 0) {
-            return;
+            return 0;
         }
 
         ret = vhost_vsock_set_running(vdev, 1);
         if (ret < 0) {
             vhost_vsock_common_stop(vdev);
             error_report("Error starting vhost vsock: %d", -ret);
-            return;
+            return 0;
         }
     } else {
         ret = vhost_vsock_set_running(vdev, 0);
         if (ret < 0) {
             error_report("vhost vsock set running failed: %d", ret);
-            return;
+            return 0;
         }
 
         vhost_vsock_common_stop(vdev);
     }
+    return 0;
 }
 
 static uint64_t vhost_vsock_get_features(VirtIODevice *vdev,
diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 2eb5a14fa2..e719c078aa 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -958,7 +958,7 @@ static void virtio_balloon_device_reset(VirtIODevice *vdev)
     s->poison_val = 0;
 }
 
-static void virtio_balloon_set_status(VirtIODevice *vdev, uint8_t status)
+static int virtio_balloon_set_status(VirtIODevice *vdev, uint8_t status)
 {
     VirtIOBalloon *s = VIRTIO_BALLOON(vdev);
 
@@ -988,6 +988,7 @@ static void virtio_balloon_set_status(VirtIODevice *vdev, uint8_t status)
             qemu_mutex_unlock(&s->free_page_lock);
         }
     }
+    return 0;
 }
 
 static ResettableState *virtio_balloon_get_reset_state(Object *obj)
diff --git a/hw/virtio/virtio-crypto.c b/hw/virtio/virtio-crypto.c
index a1b3c90618..626279dc5b 100644
--- a/hw/virtio/virtio-crypto.c
+++ b/hw/virtio/virtio-crypto.c
@@ -1197,11 +1197,12 @@ static void virtio_crypto_vhost_status(VirtIOCrypto *c, uint8_t status)
     }
 }
 
-static void virtio_crypto_set_status(VirtIODevice *vdev, uint8_t status)
+static int virtio_crypto_set_status(VirtIODevice *vdev, uint8_t status)
 {
     VirtIOCrypto *vcrypto = VIRTIO_CRYPTO(vdev);
 
     virtio_crypto_vhost_status(vcrypto, status);
+    return 0;
 }
 
 static void virtio_crypto_guest_notifier_mask(VirtIODevice *vdev, int idx,
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index b6e7e01ef7..63ec12014f 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -1522,9 +1522,10 @@ static void virtio_iommu_device_reset_exit(Object *obj, ResetType type)
                                    NULL, NULL, virtio_iommu_put_endpoint);
 }
 
-static void virtio_iommu_set_status(VirtIODevice *vdev, uint8_t status)
+static int virtio_iommu_set_status(VirtIODevice *vdev, uint8_t status)
 {
     trace_virtio_iommu_device_status(status);
+    return 0;
 }
 
 static void virtio_iommu_instance_init(Object *obj)
diff --git a/hw/virtio/virtio-rng.c b/hw/virtio/virtio-rng.c
index a515fc5cd9..4748d039f9 100644
--- a/hw/virtio/virtio-rng.c
+++ b/hw/virtio/virtio-rng.c
@@ -159,17 +159,18 @@ static void check_rate_limit(void *opaque)
     vrng->activate_timer = true;
 }
 
-static void virtio_rng_set_status(VirtIODevice *vdev, uint8_t status)
+static int virtio_rng_set_status(VirtIODevice *vdev, uint8_t status)
 {
     VirtIORNG *vrng = VIRTIO_RNG(vdev);
 
     if (!vdev->vm_running) {
-        return;
+        return 0;
     }
     vdev->status = status;
 
     /* Something changed, try to process buffers */
     virtio_rng_process(vrng);
+    return 0;
 }
 
 static void virtio_rng_device_realize(DeviceState *dev, Error **errp)
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 202a052053..8d292b45c5 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -2221,12 +2221,12 @@ int virtio_set_status(VirtIODevice *vdev, uint8_t val)
 {
     VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
     trace_virtio_set_status(vdev, val);
+    int ret = 0;
 
     if (virtio_vdev_has_feature(vdev, VIRTIO_F_VERSION_1)) {
         if (!(vdev->status & VIRTIO_CONFIG_S_FEATURES_OK) &&
             val & VIRTIO_CONFIG_S_FEATURES_OK) {
-            int ret = virtio_validate_features(vdev);
-
+            ret = virtio_validate_features(vdev);
             if (ret) {
                 return ret;
             }
@@ -2239,11 +2239,15 @@ int virtio_set_status(VirtIODevice *vdev, uint8_t val)
     }
 
     if (k->set_status) {
-        k->set_status(vdev, val);
+        ret = k->set_status(vdev, val);
+        if (ret) {
+            qemu_log("set %s status to %d failed, old status: %d\n",
+                     vdev->name, val, vdev->status);
+        }
     }
     vdev->status = val;
 
-    return 0;
+    return ret;
 }
 
 static enum virtio_device_endian virtio_default_endian(void)
@@ -3419,7 +3423,7 @@ void virtio_cleanup(VirtIODevice *vdev)
     qemu_del_vm_change_state_handler(vdev->vmstate);
 }
 
-static void virtio_vmstate_change(void *opaque, bool running, RunState state)
+static int virtio_vmstate_change(void *opaque, bool running, RunState state)
 {
     VirtIODevice *vdev = opaque;
     BusState *qbus = qdev_get_parent_bus(DEVICE(vdev));
@@ -3436,8 +3440,13 @@ static void virtio_vmstate_change(void *opaque, bool running, RunState state)
     }
 
     if (!backend_run) {
-        virtio_set_status(vdev, vdev->status);
+        // the return value was used for stopping VM during migration
+        int ret = virtio_set_status(vdev, vdev->status);
+        if (ret) {
+            return ret;
+        }
     }
+    return 0;
 }
 
 void virtio_instance_init_common(Object *proxy_obj, void *data,
@@ -3489,7 +3498,7 @@ void virtio_init(VirtIODevice *vdev, uint16_t device_id, size_t config_size)
         vdev->config = NULL;
     }
     vdev->vmstate = qdev_add_vm_change_state_handler(DEVICE(vdev),
-            virtio_vmstate_change, NULL, vdev);
+            NULL, virtio_vmstate_change, vdev);
     vdev->device_endian = virtio_default_endian();
     vdev->use_guest_notifier_mask = true;
 }
diff --git a/include/hw/virtio/vhost-scsi-common.h b/include/hw/virtio/vhost-scsi-common.h
index c5d2c09455..d54d9c916f 100644
--- a/include/hw/virtio/vhost-scsi-common.h
+++ b/include/hw/virtio/vhost-scsi-common.h
@@ -40,7 +40,7 @@ struct VHostSCSICommon {
 };
 
 int vhost_scsi_common_start(VHostSCSICommon *vsc, Error **errp);
-void vhost_scsi_common_stop(VHostSCSICommon *vsc);
+int vhost_scsi_common_stop(VHostSCSICommon *vsc);
 char *vhost_scsi_common_get_fw_dev_path(FWPathProvider *p, BusState *bus,
                                         DeviceState *dev);
 void vhost_scsi_common_set_config(VirtIODevice *vdev, const uint8_t *config);
diff --git a/include/hw/virtio/vhost-vsock-common.h b/include/hw/virtio/vhost-vsock-common.h
index 75a74e8a99..01bf6062af 100644
--- a/include/hw/virtio/vhost-vsock-common.h
+++ b/include/hw/virtio/vhost-vsock-common.h
@@ -42,7 +42,7 @@ struct VHostVSockCommon {
 };
 
 int vhost_vsock_common_start(VirtIODevice *vdev);
-void vhost_vsock_common_stop(VirtIODevice *vdev);
+int vhost_vsock_common_stop(VirtIODevice *vdev);
 int vhost_vsock_common_pre_save(void *opaque);
 int vhost_vsock_common_post_load(void *opaque, int version_id);
 void vhost_vsock_common_realize(VirtIODevice *vdev);
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index 6386910280..a7ee9bec75 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -186,7 +186,7 @@ struct VirtioDeviceClass {
     void (*get_config)(VirtIODevice *vdev, uint8_t *config);
     void (*set_config)(VirtIODevice *vdev, const uint8_t *config);
     void (*reset)(VirtIODevice *vdev);
-    void (*set_status)(VirtIODevice *vdev, uint8_t val);
+    int (*set_status)(VirtIODevice *vdev, uint8_t val);
     /* Device must validate queue_index.  */
     void (*queue_reset)(VirtIODevice *vdev, uint32_t queue_index);
     /* Device must validate queue_index.  */
diff --git a/include/system/vhost-user-backend.h b/include/system/vhost-user-backend.h
index 327b0b84f1..87765a1218 100644
--- a/include/system/vhost-user-backend.h
+++ b/include/system/vhost-user-backend.h
@@ -43,6 +43,6 @@ struct VhostUserBackend {
 int vhost_user_backend_dev_init(VhostUserBackend *b, VirtIODevice *vdev,
                                 unsigned nvqs, Error **errp);
 void vhost_user_backend_start(VhostUserBackend *b);
-void vhost_user_backend_stop(VhostUserBackend *b);
+int vhost_user_backend_stop(VhostUserBackend *b);
 
 #endif
-- 
2.44.0

Re: [PATCH v3 0/3] vhost: fix the IO error after live migration

Posted by Lei Yang 2 days, 6 hours ago

QE tested this series v3 with virtio-net regression tests, everything works
fine.

Tested-by: Lei Yang <leiyang@redhat.com>

On Thu, Mar 27, 2025 at 2:46 PM Haoqian He <haoqian.he@smartx.com> wrote:

> At the end of the VM live migration, the vhost device will be stopped.
> Currently, if the vhost-user backend crashes, vhost device's set_status()
> would not return failure, live migration won't perceive the disconnection
> with the backend. After the live migration is successful, the stale
> inflight
> IO will be submitted to the migration target host, which may be leading to
> the IO error.
>
> The following patch series fixes the issue by making the live migration
> aware of the loss of connection with the vhost-user backend and aborting
> the live migration.
>
> ---
>   v1 ... v2
>     1. Fix some grammar issues in commit message.
>     2. Remove assert in vhost_scsi_common_stop and return error upwards.
>
>   v2 ... v3
>     1. Added more detailed comments and commit message.
>     2. Change the newly added type name and parameter name.
>     3. Remove set_status_ext, change the return type of set_status to int.
>
> Haoqian He (3):
>   system/runstate: add VM state change cb with return value
>   vhost: return failure if stop virtqueue failed in vhost_dev_stop
>   vhost-user: return failure if backend crash when live migration
>
>  backends/vhost-user.c                  | 20 +++++++--------
>  hw/block/vhost-user-blk.c              | 27 +++++++++++---------
>  hw/block/virtio-blk.c                  |  7 +++---
>  hw/char/virtio-serial-bus.c            |  3 ++-
>  hw/core/vm-change-state-handler.c      | 18 ++++++++-----
>  hw/display/vhost-user-gpu.c            | 12 ++++++---
>  hw/input/virtio-input.c                |  3 ++-
>  hw/net/virtio-net.c                    |  3 ++-
>  hw/scsi/scsi-bus.c                     |  2 +-
>  hw/scsi/vhost-scsi-common.c            | 13 +++++-----
>  hw/scsi/vhost-scsi.c                   |  5 ++--
>  hw/scsi/vhost-user-scsi.c              | 18 +++++++------
>  hw/vfio/migration.c                    |  2 +-
>  hw/virtio/vdpa-dev.c                   |  5 ++--
>  hw/virtio/vhost-user-base.c            | 23 ++++++++++-------
>  hw/virtio/vhost-user-fs.c              | 23 ++++++++++-------
>  hw/virtio/vhost-user-scmi.c            | 27 ++++++++++++--------
>  hw/virtio/vhost-user-vsock.c           | 15 +++++++----
>  hw/virtio/vhost-vsock-common.c         | 12 ++++-----
>  hw/virtio/vhost-vsock.c                | 11 ++++----
>  hw/virtio/vhost.c                      | 23 +++++++++--------
>  hw/virtio/virtio-balloon.c             |  3 ++-
>  hw/virtio/virtio-crypto.c              |  3 ++-
>  hw/virtio/virtio-iommu.c               |  3 ++-
>  hw/virtio/virtio-rng.c                 |  5 ++--
>  hw/virtio/virtio.c                     | 23 +++++++++++------
>  include/hw/virtio/vhost-scsi-common.h  |  2 +-
>  include/hw/virtio/vhost-vsock-common.h |  2 +-
>  include/hw/virtio/vhost.h              |  8 +++---
>  include/hw/virtio/virtio.h             |  2 +-
>  include/system/runstate.h              | 13 +++++++---
>  include/system/vhost-user-backend.h    |  2 +-
>  system/cpus.c                          |  8 ++++--
>  system/runstate.c                      | 35 ++++++++++++++++++++++----
>  34 files changed, 239 insertions(+), 142 deletions(-)
>
> --
> 2.44.0
>
>
>

[PATCH v3 1/3] system/runstate: add VM state change cb with return value

Posted by Haoqian He 6 days, 6 hours ago

This patch adds the new VM state change cb type `VMChangeStateHandlerWithRet`,
which has return value for `VMChangeStateEntry`.

Thus, we can register a new VM state change cb with return value for device.
Note that `VMChangeStateHandler` and `VMChangeStateHandlerWithRet` are mutually
exclusive and cannot be provided at the same time.

This patch is the pre patch for 'vhost-user: return failure if backend crashes
when live migration', which makes the live migration aware of the loss of
connection with the vhost-user backend and aborts the live migration.

Virtio device will use VMChangeStateHandlerWithRet.

Signed-off-by: Haoqian He <haoqian.he@smartx.com>
---
 hw/block/virtio-blk.c             |  2 +-
 hw/core/vm-change-state-handler.c | 18 ++++++++++------
 hw/scsi/scsi-bus.c                |  2 +-
 hw/vfio/migration.c               |  2 +-
 hw/virtio/virtio.c                |  2 +-
 include/system/runstate.h         | 13 +++++++++---
 system/cpus.c                     |  8 +++++--
 system/runstate.c                 | 35 ++++++++++++++++++++++++++-----
 8 files changed, 62 insertions(+), 20 deletions(-)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 5077793e5e..458871ea2b 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -1802,7 +1802,7 @@ static void virtio_blk_device_realize(DeviceState *dev, Error **errp)
      * called after ->start_ioeventfd() has already set blk's AioContext.
      */
     s->change =
-        qdev_add_vm_change_state_handler(dev, virtio_blk_dma_restart_cb, s);
+        qdev_add_vm_change_state_handler(dev, virtio_blk_dma_restart_cb, NULL, s);
 
     blk_ram_registrar_init(&s->blk_ram_registrar, s->blk);
     blk_set_dev_ops(s->blk, &virtio_block_ops, s);
diff --git a/hw/core/vm-change-state-handler.c b/hw/core/vm-change-state-handler.c
index 7064995578..99c642b558 100644
--- a/hw/core/vm-change-state-handler.c
+++ b/hw/core/vm-change-state-handler.c
@@ -40,6 +40,7 @@ static int qdev_get_dev_tree_depth(DeviceState *dev)
  * qdev_add_vm_change_state_handler:
  * @dev: the device that owns this handler
  * @cb: the callback function to be invoked
+ * @cb_ret: the callback function with return value to be invoked
  * @opaque: user data passed to the callback function
  *
  * This function works like qemu_add_vm_change_state_handler() except callbacks
@@ -50,25 +51,30 @@ static int qdev_get_dev_tree_depth(DeviceState *dev)
  * controller's callback is invoked before the children on its bus when the VM
  * starts running.  The order is reversed when the VM stops running.
  *
+ * Note that the parameter `cb` and `cb_ret` are mutually exclusive.
+ *
  * Returns: an entry to be freed with qemu_del_vm_change_state_handler()
  */
 VMChangeStateEntry *qdev_add_vm_change_state_handler(DeviceState *dev,
                                                      VMChangeStateHandler *cb,
+                                                     VMChangeStateHandlerWithRet *cb_ret,
                                                      void *opaque)
 {
-    return qdev_add_vm_change_state_handler_full(dev, cb, NULL, opaque);
+    assert(!cb || !cb_ret);
+    return qdev_add_vm_change_state_handler_full(dev, cb, NULL, cb_ret, opaque);
 }
 
 /*
  * Exactly like qdev_add_vm_change_state_handler() but passes a prepare_cb
- * argument too.
+ * and the cb_ret arguments too.
  */
 VMChangeStateEntry *qdev_add_vm_change_state_handler_full(
-    DeviceState *dev, VMChangeStateHandler *cb,
-    VMChangeStateHandler *prepare_cb, void *opaque)
+    DeviceState *dev, VMChangeStateHandler *cb, VMChangeStateHandler *prepare_cb,
+    VMChangeStateHandlerWithRet *cb_ret, void *opaque)
 {
     int depth = qdev_get_dev_tree_depth(dev);
 
-    return qemu_add_vm_change_state_handler_prio_full(cb, prepare_cb, opaque,
-                                                      depth);
+    assert(!cb || !cb_ret);
+    return qemu_add_vm_change_state_handler_prio_full(cb, prepare_cb, cb_ret,
+                                                      opaque, depth);
 }
diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
index ece1107ee8..4fa0d9769a 100644
--- a/hw/scsi/scsi-bus.c
+++ b/hw/scsi/scsi-bus.c
@@ -400,7 +400,7 @@ static void scsi_qdev_realize(DeviceState *qdev, Error **errp)
         return;
     }
     dev->vmsentry = qdev_add_vm_change_state_handler(DEVICE(dev),
-            scsi_dma_restart_cb, dev);
+            scsi_dma_restart_cb, NULL, dev);
 }
 
 static void scsi_qdev_unrealize(DeviceState *qdev)
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index fbff46cfc3..42948dffc5 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -1014,7 +1014,7 @@ static int vfio_migration_init(VFIODevice *vbasedev)
                      vfio_vmstate_change_prepare :
                      NULL;
     migration->vm_state = qdev_add_vm_change_state_handler_full(
-        vbasedev->dev, vfio_vmstate_change, prepare_cb, vbasedev);
+        vbasedev->dev, vfio_vmstate_change, prepare_cb, NULL, vbasedev);
     migration_add_notifier(&migration->migration_state,
                            vfio_migration_state_notifier);
 
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 85110bce37..202a052053 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -3489,7 +3489,7 @@ void virtio_init(VirtIODevice *vdev, uint16_t device_id, size_t config_size)
         vdev->config = NULL;
     }
     vdev->vmstate = qdev_add_vm_change_state_handler(DEVICE(vdev),
-            virtio_vmstate_change, vdev);
+            virtio_vmstate_change, NULL, vdev);
     vdev->device_endian = virtio_default_endian();
     vdev->use_guest_notifier_mask = true;
 }
diff --git a/include/system/runstate.h b/include/system/runstate.h
index bffc3719d4..fdd5c4a517 100644
--- a/include/system/runstate.h
+++ b/include/system/runstate.h
@@ -12,6 +12,7 @@ bool runstate_needs_reset(void);
 void runstate_replay_enable(void);
 
 typedef void VMChangeStateHandler(void *opaque, bool running, RunState state);
+typedef int VMChangeStateHandlerWithRet(void *opaque, bool running, RunState state);
 
 VMChangeStateEntry *qemu_add_vm_change_state_handler(VMChangeStateHandler *cb,
                                                      void *opaque);
@@ -20,21 +21,27 @@ VMChangeStateEntry *qemu_add_vm_change_state_handler_prio(
 VMChangeStateEntry *
 qemu_add_vm_change_state_handler_prio_full(VMChangeStateHandler *cb,
                                            VMChangeStateHandler *prepare_cb,
+                                           VMChangeStateHandlerWithRet *cb_ret,
                                            void *opaque, int priority);
 VMChangeStateEntry *qdev_add_vm_change_state_handler(DeviceState *dev,
                                                      VMChangeStateHandler *cb,
+                                                     VMChangeStateHandlerWithRet *cb_ret,
                                                      void *opaque);
 VMChangeStateEntry *qdev_add_vm_change_state_handler_full(
-    DeviceState *dev, VMChangeStateHandler *cb,
-    VMChangeStateHandler *prepare_cb, void *opaque);
+    DeviceState *dev, VMChangeStateHandler *cb, VMChangeStateHandler *prepare_cb,
+    VMChangeStateHandlerWithRet *cb_ret, void *opaque);
 void qemu_del_vm_change_state_handler(VMChangeStateEntry *e);
 /**
  * vm_state_notify: Notify the state of the VM
  *
  * @running: whether the VM is running or not.
  * @state: the #RunState of the VM.
+ *
+ * Return the result of the callback which has return value.
+ * If no callback has return value, still return 0 and the
+ * upper layer should not do additional processing.
  */
-void vm_state_notify(bool running, RunState state);
+int vm_state_notify(bool running, RunState state);
 
 static inline bool shutdown_caused_by_guest(ShutdownCause cause)
 {
diff --git a/system/cpus.c b/system/cpus.c
index 2cc5f887ab..d16b0dff98 100644
--- a/system/cpus.c
+++ b/system/cpus.c
@@ -299,14 +299,18 @@ static int do_vm_stop(RunState state, bool send_stop)
         if (oldstate == RUN_STATE_RUNNING) {
             pause_all_vcpus();
         }
-        vm_state_notify(0, state);
+        ret = vm_state_notify(0, state);
         if (send_stop) {
             qapi_event_send_stop();
         }
     }
 
     bdrv_drain_all();
-    ret = bdrv_flush_all();
+    /*
+     * Even if vm_state_notify() return failure,
+     * it would be better to flush as before.
+     */
+    ret |= bdrv_flush_all();
     trace_vm_stop_flush_all(ret);
 
     return ret;
diff --git a/system/runstate.c b/system/runstate.c
index 272801d307..de74d962bc 100644
--- a/system/runstate.c
+++ b/system/runstate.c
@@ -297,6 +297,7 @@ void qemu_system_vmstop_request(RunState state)
 struct VMChangeStateEntry {
     VMChangeStateHandler *cb;
     VMChangeStateHandler *prepare_cb;
+    VMChangeStateHandlerWithRet *cb_ret;
     void *opaque;
     QTAILQ_ENTRY(VMChangeStateEntry) entries;
     int priority;
@@ -320,14 +321,15 @@ static QTAILQ_HEAD(, VMChangeStateEntry) vm_change_state_head =
 VMChangeStateEntry *qemu_add_vm_change_state_handler_prio(
         VMChangeStateHandler *cb, void *opaque, int priority)
 {
-    return qemu_add_vm_change_state_handler_prio_full(cb, NULL, opaque,
-                                                      priority);
+    return qemu_add_vm_change_state_handler_prio_full(cb, NULL, NULL,
+                                                      opaque, priority);
 }
 
 /**
  * qemu_add_vm_change_state_handler_prio_full:
  * @cb: the main callback to invoke
  * @prepare_cb: a callback to invoke before the main callback
+ * @cb_ret: the main callback to invoke with return value
  * @opaque: user data passed to the callbacks
  * @priority: low priorities execute first when the vm runs and the reverse is
  *            true when the vm stops
@@ -344,6 +346,7 @@ VMChangeStateEntry *qemu_add_vm_change_state_handler_prio(
 VMChangeStateEntry *
 qemu_add_vm_change_state_handler_prio_full(VMChangeStateHandler *cb,
                                            VMChangeStateHandler *prepare_cb,
+                                           VMChangeStateHandlerWithRet *cb_ret,
                                            void *opaque, int priority)
 {
     VMChangeStateEntry *e;
@@ -352,6 +355,7 @@ qemu_add_vm_change_state_handler_prio_full(VMChangeStateHandler *cb,
     e = g_malloc0(sizeof(*e));
     e->cb = cb;
     e->prepare_cb = prepare_cb;
+    e->cb_ret = cb_ret;
     e->opaque = opaque;
     e->priority = priority;
 
@@ -379,9 +383,10 @@ void qemu_del_vm_change_state_handler(VMChangeStateEntry *e)
     g_free(e);
 }
 
-void vm_state_notify(bool running, RunState state)
+int vm_state_notify(bool running, RunState state)
 {
     VMChangeStateEntry *e, *next;
+    int ret = 0;
 
     trace_vm_state_notify(running, state, RunState_str(state));
 
@@ -393,7 +398,17 @@ void vm_state_notify(bool running, RunState state)
         }
 
         QTAILQ_FOREACH_SAFE(e, &vm_change_state_head, entries, next) {
-            e->cb(e->opaque, running, state);
+            if (e->cb) {
+                e->cb(e->opaque, running, state);
+            } else if (e->cb_ret) {
+                /*
+                 * Here ignore the return value of cb_ret because
+                 * we only care about the stopping the device during
+                 * the VM live migration to indicate whether the
+                 * connection between qemu and backend is normal.
+                 */
+                e->cb_ret(e->opaque, running, state);
+            }
         }
     } else {
         QTAILQ_FOREACH_REVERSE_SAFE(e, &vm_change_state_head, entries, next) {
@@ -403,9 +418,19 @@ void vm_state_notify(bool running, RunState state)
         }
 
         QTAILQ_FOREACH_REVERSE_SAFE(e, &vm_change_state_head, entries, next) {
-            e->cb(e->opaque, running, state);
+            if (e->cb) {
+                e->cb(e->opaque, running, state);
+            } else if (e->cb_ret) {
+                /*
+                 * We should execute all registered callbacks even if
+                 * one of them returns failure, otherwise, some cleanup
+                 * work of the device will be skipped.
+                 */
+                ret |= e->cb_ret(e->opaque, running, state);
+            }
         }
     }
+    return ret;
 }
 
 static ShutdownCause reset_requested;
-- 
2.44.0

[PATCH v3 2/3] vhost: return failure if stop virtqueue failed in vhost_dev_stop

Posted by Haoqian He 6 days, 6 hours ago

This patch captures the error of vhost_virtqueue_stop() in vhost_dev_stop()
and returns the error upward.

Specifically, if QEMU is disconnected from the vhost backend, some actions
in vhost_dev_stop() will fail, such as sending vhost-user messages to the
backend (GET_VRING_BASE, SET_VRING_ENABLE) and vhost_reset_status.

Considering that both set_vring_enable and vhost_reset_status require setting
the specific virtio feature bit, we can capture vhost_virtqueue_stop()'s
error to indicate that QEMU has lost connection with the backend.

This patch is the pre patch for 'vhost-user: return failure if backend crashes
when live migration', which makes the live migration aware of the loss of
connection with the vhost-user backend and aborts the live migration.

Signed-off-by: Haoqian He <haoqian.he@smartx.com>
---
 hw/virtio/vhost.c         | 23 +++++++++++++----------
 include/hw/virtio/vhost.h |  8 +++++---
 2 files changed, 18 insertions(+), 13 deletions(-)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 6aa72fd434..bdd945c18d 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1368,10 +1368,10 @@ fail_alloc_desc:
     return r;
 }
 
-void vhost_virtqueue_stop(struct vhost_dev *dev,
-                          struct VirtIODevice *vdev,
-                          struct vhost_virtqueue *vq,
-                          unsigned idx)
+int vhost_virtqueue_stop(struct vhost_dev *dev,
+                         struct VirtIODevice *vdev,
+                         struct vhost_virtqueue *vq,
+                         unsigned idx)
 {
     int vhost_vq_index = dev->vhost_ops->vhost_get_vq_index(dev, idx);
     struct vhost_vring_state state = {
@@ -1381,7 +1381,7 @@ void vhost_virtqueue_stop(struct vhost_dev *dev,
 
     if (virtio_queue_get_desc_addr(vdev, idx) == 0) {
         /* Don't stop the virtqueue which might have not been started */
-        return;
+        return 0;
     }
 
     r = dev->vhost_ops->vhost_get_vring_base(dev, &state);
@@ -1412,6 +1412,7 @@ void vhost_virtqueue_stop(struct vhost_dev *dev,
                        0, virtio_queue_get_avail_size(vdev, idx));
     vhost_memory_unmap(dev, vq->desc, virtio_queue_get_desc_size(vdev, idx),
                        0, virtio_queue_get_desc_size(vdev, idx));
+    return r;
 }
 
 static int vhost_virtqueue_set_busyloop_timeout(struct vhost_dev *dev,
@@ -2136,9 +2137,10 @@ fail_features:
 }
 
 /* Host notifiers must be enabled at this point. */
-void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
+int vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
 {
     int i;
+    int rc = 0;
 
     /* should only be called after backend is connected */
     assert(hdev->vhost_ops);
@@ -2157,10 +2159,10 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
         vhost_dev_set_vring_enable(hdev, false);
     }
     for (i = 0; i < hdev->nvqs; ++i) {
-        vhost_virtqueue_stop(hdev,
-                             vdev,
-                             hdev->vqs + i,
-                             hdev->vq_index + i);
+        rc |= vhost_virtqueue_stop(hdev,
+                                   vdev,
+                                   hdev->vqs + i,
+                                   hdev->vq_index + i);
     }
     if (hdev->vhost_ops->vhost_reset_status) {
         hdev->vhost_ops->vhost_reset_status(hdev);
@@ -2177,6 +2179,7 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
     hdev->started = false;
     vdev->vhost_started = false;
     hdev->vdev = NULL;
+    return rc;
 }
 
 int vhost_net_set_backend(struct vhost_dev *hdev,
diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index a9469d50bc..fd96ec9c39 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -232,8 +232,10 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings);
  * Stop the vhost device. After the device is stopped the notifiers
  * can be disabled (@vhost_dev_disable_notifiers) and the device can
  * be torn down (@vhost_dev_cleanup).
+ *
+ * Return: 0 on success, != 0 on error when stopping dev.
  */
-void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings);
+int vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings);
 
 /**
  * DOC: vhost device configuration handling
@@ -333,8 +335,8 @@ int vhost_device_iotlb_miss(struct vhost_dev *dev, uint64_t iova, int write);
 
 int vhost_virtqueue_start(struct vhost_dev *dev, struct VirtIODevice *vdev,
                           struct vhost_virtqueue *vq, unsigned idx);
-void vhost_virtqueue_stop(struct vhost_dev *dev, struct VirtIODevice *vdev,
-                          struct vhost_virtqueue *vq, unsigned idx);
+int vhost_virtqueue_stop(struct vhost_dev *dev, struct VirtIODevice *vdev,
+                         struct vhost_virtqueue *vq, unsigned idx);
 
 void vhost_dev_reset_inflight(struct vhost_inflight *inflight);
 void vhost_dev_free_inflight(struct vhost_inflight *inflight);
-- 
2.44.0

[PATCH v3 3/3] vhost-user: return failure if backend crash when live migration

Posted by Haoqian He 6 days, 6 hours ago

Live migration should be terminated if the vhost-user backend crashes
before the migration completes.

Specifically, since the vhost device will be stopped when VM is stopped
before the end of the live migration, in current implementation if the
backend crashes, vhost-user device set_status() won't return failure,
live migration won't perceive the disconnection between QEMU and the
backend.

When the VM is migrated to the destination, the inflight IO will be
resubmitted, and if the IO was completed out of order before, it will
cause IO error.

To fix this issue:
1. Add the return value to set_status() for VirtioDeviceClass.
2. Return failure if vhost_dev_stop() failed for vhost-user device.

If QEMU loses connection with the vhost-user backend, virtio set_status()
can return failure to the upper layer, migration_completion() can handle
the error, terminate the live migration, and restore the VM, so that
inflight IO can be completed normally.

Signed-off-by: Haoqian He <haoqian.he@smartx.com>
---
 backends/vhost-user.c                  | 20 +++++++++----------
 hw/block/vhost-user-blk.c              | 27 ++++++++++++++------------
 hw/block/virtio-blk.c                  |  5 +++--
 hw/char/virtio-serial-bus.c            |  3 ++-
 hw/display/vhost-user-gpu.c            | 12 +++++++++---
 hw/input/virtio-input.c                |  3 ++-
 hw/net/virtio-net.c                    |  3 ++-
 hw/scsi/vhost-scsi-common.c            | 13 +++++++------
 hw/scsi/vhost-scsi.c                   |  5 +++--
 hw/scsi/vhost-user-scsi.c              | 18 ++++++++++-------
 hw/virtio/vdpa-dev.c                   |  5 +++--
 hw/virtio/vhost-user-base.c            | 23 +++++++++++++---------
 hw/virtio/vhost-user-fs.c              | 23 +++++++++++++---------
 hw/virtio/vhost-user-scmi.c            | 27 +++++++++++++++-----------
 hw/virtio/vhost-user-vsock.c           | 15 +++++++++-----
 hw/virtio/vhost-vsock-common.c         | 12 ++++++------
 hw/virtio/vhost-vsock.c                | 11 ++++++-----
 hw/virtio/virtio-balloon.c             |  3 ++-
 hw/virtio/virtio-crypto.c              |  3 ++-
 hw/virtio/virtio-iommu.c               |  3 ++-
 hw/virtio/virtio-rng.c                 |  5 +++--
 hw/virtio/virtio.c                     | 23 ++++++++++++++--------
 include/hw/virtio/vhost-scsi-common.h  |  2 +-
 include/hw/virtio/vhost-vsock-common.h |  2 +-
 include/hw/virtio/virtio.h             |  2 +-
 include/system/vhost-user-backend.h    |  2 +-
 26 files changed, 160 insertions(+), 110 deletions(-)

diff --git a/backends/vhost-user.c b/backends/vhost-user.c
index d0e4d71a63..5409e9d4eb 100644
--- a/backends/vhost-user.c
+++ b/backends/vhost-user.c
@@ -97,30 +97,28 @@ err_host_notifiers:
     vhost_dev_disable_notifiers(&b->dev, b->vdev);
 }
 
-void
+int
 vhost_user_backend_stop(VhostUserBackend *b)
 {
     BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(b->vdev)));
     VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
-    int ret = 0;
+    int ret;
 
     if (!b->started) {
-        return;
+        return 0;
     }
 
-    vhost_dev_stop(&b->dev, b->vdev, true);
+    ret = vhost_dev_stop(&b->dev, b->vdev, true);
 
-    if (k->set_guest_notifiers) {
-        ret = k->set_guest_notifiers(qbus->parent,
-                                     b->dev.nvqs, false);
-        if (ret < 0) {
-            error_report("vhost guest notifier cleanup failed: %d", ret);
-        }
+    if (k->set_guest_notifiers &&
+        k->set_guest_notifiers(qbus->parent, b->dev.nvqs, false)< 0) {
+        error_report("vhost guest notifier cleanup failed: %d", ret);
+        return -1;
     }
-    assert(ret >= 0);
 
     vhost_dev_disable_notifiers(&b->dev, b->vdev);
     b->started = false;
+    return ret;
 }
 
 static void set_chardev(Object *obj, const char *value, Error **errp)
diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
index ae42327cf8..f355459123 100644
--- a/hw/block/vhost-user-blk.c
+++ b/hw/block/vhost-user-blk.c
@@ -204,7 +204,7 @@ err_host_notifiers:
     return ret;
 }
 
-static void vhost_user_blk_stop(VirtIODevice *vdev)
+static int vhost_user_blk_stop(VirtIODevice *vdev)
 {
     VHostUserBlk *s = VHOST_USER_BLK(vdev);
     BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
@@ -212,26 +212,26 @@ static void vhost_user_blk_stop(VirtIODevice *vdev)
     int ret;
 
     if (!s->started_vu) {
-        return;
+        return 0;
     }
     s->started_vu = false;
 
     if (!k->set_guest_notifiers) {
-        return;
+        return 0;
     }
 
-    vhost_dev_stop(&s->dev, vdev, true);
+    ret = vhost_dev_stop(&s->dev, vdev, true);
 
-    ret = k->set_guest_notifiers(qbus->parent, s->dev.nvqs, false);
-    if (ret < 0) {
+    if (k->set_guest_notifiers(qbus->parent, s->dev.nvqs, false) < 0) {
         error_report("vhost guest notifier cleanup failed: %d", ret);
-        return;
+        return -1;
     }
 
     vhost_dev_disable_notifiers(&s->dev, vdev);
+    return ret;
 }
 
-static void vhost_user_blk_set_status(VirtIODevice *vdev, uint8_t status)
+static int vhost_user_blk_set_status(VirtIODevice *vdev, uint8_t status)
 {
     VHostUserBlk *s = VHOST_USER_BLK(vdev);
     bool should_start = virtio_device_should_start(vdev, status);
@@ -239,11 +239,11 @@ static void vhost_user_blk_set_status(VirtIODevice *vdev, uint8_t status)
     int ret;
 
     if (!s->connected) {
-        return;
+        return -1;
     }
 
     if (vhost_dev_is_started(&s->dev) == should_start) {
-        return;
+        return 0;
     }
 
     if (should_start) {
@@ -253,9 +253,12 @@ static void vhost_user_blk_set_status(VirtIODevice *vdev, uint8_t status)
             qemu_chr_fe_disconnect(&s->chardev);
         }
     } else {
-        vhost_user_blk_stop(vdev);
+        ret = vhost_user_blk_stop(vdev);
+        if (ret < 0) {
+            return ret;
+        }
     }
-
+    return 0;
 }
 
 static uint64_t vhost_user_blk_get_features(VirtIODevice *vdev,
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 458871ea2b..caf8042cd5 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -1270,7 +1270,7 @@ static uint64_t virtio_blk_get_features(VirtIODevice *vdev, uint64_t features,
     return features;
 }
 
-static void virtio_blk_set_status(VirtIODevice *vdev, uint8_t status)
+static int virtio_blk_set_status(VirtIODevice *vdev, uint8_t status)
 {
     VirtIOBlock *s = VIRTIO_BLK(vdev);
 
@@ -1279,7 +1279,7 @@ static void virtio_blk_set_status(VirtIODevice *vdev, uint8_t status)
     }
 
     if (!(status & VIRTIO_CONFIG_S_DRIVER_OK)) {
-        return;
+        return 0;
     }
 
     /* A guest that supports VIRTIO_BLK_F_CONFIG_WCE must be able to send
@@ -1302,6 +1302,7 @@ static void virtio_blk_set_status(VirtIODevice *vdev, uint8_t status)
                                    virtio_vdev_has_feature(vdev,
                                                            VIRTIO_BLK_F_WCE));
     }
+    return 0;
 }
 
 static void virtio_blk_save_device(VirtIODevice *vdev, QEMUFile *f)
diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
index b6d2743a9c..22accfd8e4 100644
--- a/hw/char/virtio-serial-bus.c
+++ b/hw/char/virtio-serial-bus.c
@@ -622,7 +622,7 @@ static void guest_reset(VirtIOSerial *vser)
     }
 }
 
-static void set_status(VirtIODevice *vdev, uint8_t status)
+static int set_status(VirtIODevice *vdev, uint8_t status)
 {
     VirtIOSerial *vser;
     VirtIOSerialPort *port;
@@ -650,6 +650,7 @@ static void set_status(VirtIODevice *vdev, uint8_t status)
             vsc->enable_backend(port, vdev->vm_running);
         }
     }
+    return 0;
 }
 
 static void vser_reset(VirtIODevice *vdev)
diff --git a/hw/display/vhost-user-gpu.c b/hw/display/vhost-user-gpu.c
index 2aed6243f6..d7d159ecd9 100644
--- a/hw/display/vhost-user-gpu.c
+++ b/hw/display/vhost-user-gpu.c
@@ -513,7 +513,7 @@ vhost_user_gpu_set_config(VirtIODevice *vdev,
     }
 }
 
-static void
+static int
 vhost_user_gpu_set_status(VirtIODevice *vdev, uint8_t val)
 {
     VhostUserGPU *g = VHOST_USER_GPU(vdev);
@@ -522,18 +522,24 @@ vhost_user_gpu_set_status(VirtIODevice *vdev, uint8_t val)
     if (val & VIRTIO_CONFIG_S_DRIVER_OK && vdev->vm_running) {
         if (!vhost_user_gpu_do_set_socket(g, &err)) {
             error_report_err(err);
-            return;
+            return 0;
         }
         vhost_user_backend_start(g->vhost);
     } else {
+        int ret;
+
         /* unblock any wait and stop processing */
         if (g->vhost_gpu_fd != -1) {
             vhost_user_gpu_update_blocked(g, true);
             qemu_chr_fe_deinit(&g->vhost_chr, true);
             g->vhost_gpu_fd = -1;
         }
-        vhost_user_backend_stop(g->vhost);
+        ret = vhost_user_backend_stop(g->vhost);
+        if (ret < 0) {
+            return ret;
+        }
     }
+    return 0;
 }
 
 static bool
diff --git a/hw/input/virtio-input.c b/hw/input/virtio-input.c
index 1394d99c6b..7ce53b7449 100644
--- a/hw/input/virtio-input.c
+++ b/hw/input/virtio-input.c
@@ -189,7 +189,7 @@ static uint64_t virtio_input_get_features(VirtIODevice *vdev, uint64_t f,
     return f;
 }
 
-static void virtio_input_set_status(VirtIODevice *vdev, uint8_t val)
+static int virtio_input_set_status(VirtIODevice *vdev, uint8_t val)
 {
     VirtIOInputClass *vic = VIRTIO_INPUT_GET_CLASS(vdev);
     VirtIOInput *vinput = VIRTIO_INPUT(vdev);
@@ -202,6 +202,7 @@ static void virtio_input_set_status(VirtIODevice *vdev, uint8_t val)
             }
         }
     }
+    return 0;
 }
 
 static void virtio_input_reset(VirtIODevice *vdev)
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index de87cfadff..b7de824baa 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -382,7 +382,7 @@ static void virtio_net_drop_tx_queue_data(VirtIODevice *vdev, VirtQueue *vq)
     }
 }
 
-static void virtio_net_set_status(struct VirtIODevice *vdev, uint8_t status)
+static int virtio_net_set_status(struct VirtIODevice *vdev, uint8_t status)
 {
     VirtIONet *n = VIRTIO_NET(vdev);
     VirtIONetQueue *q;
@@ -437,6 +437,7 @@ static void virtio_net_set_status(struct VirtIODevice *vdev, uint8_t status)
             }
         }
     }
+    return 0;
 }
 
 static void virtio_net_set_link_status(NetClientState *nc)
diff --git a/hw/scsi/vhost-scsi-common.c b/hw/scsi/vhost-scsi-common.c
index 4c8637045d..43525ba46d 100644
--- a/hw/scsi/vhost-scsi-common.c
+++ b/hw/scsi/vhost-scsi-common.c
@@ -101,24 +101,25 @@ err_host_notifiers:
     return ret;
 }
 
-void vhost_scsi_common_stop(VHostSCSICommon *vsc)
+int vhost_scsi_common_stop(VHostSCSICommon *vsc)
 {
     VirtIODevice *vdev = VIRTIO_DEVICE(vsc);
     BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
     VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
     int ret = 0;
 
-    vhost_dev_stop(&vsc->dev, vdev, true);
+    ret = vhost_dev_stop(&vsc->dev, vdev, true);
 
     if (k->set_guest_notifiers) {
-        ret = k->set_guest_notifiers(qbus->parent, vsc->dev.nvqs, false);
-        if (ret < 0) {
-                error_report("vhost guest notifier cleanup failed: %d", ret);
+        int r = k->set_guest_notifiers(qbus->parent, vsc->dev.nvqs, false);
+        if (r < 0) {
+            error_report("vhost guest notifier cleanup failed: %d", ret);
+            return r;
         }
     }
-    assert(ret >= 0);
 
     vhost_dev_disable_notifiers(&vsc->dev, vdev);
+    return ret;
 }
 
 uint64_t vhost_scsi_common_get_features(VirtIODevice *vdev, uint64_t features,
diff --git a/hw/scsi/vhost-scsi.c b/hw/scsi/vhost-scsi.c
index 8039d13fd9..19d5261019 100644
--- a/hw/scsi/vhost-scsi.c
+++ b/hw/scsi/vhost-scsi.c
@@ -114,7 +114,7 @@ static void vhost_scsi_stop(VHostSCSI *s)
     vhost_scsi_common_stop(vsc);
 }
 
-static void vhost_scsi_set_status(VirtIODevice *vdev, uint8_t val)
+static int vhost_scsi_set_status(VirtIODevice *vdev, uint8_t val)
 {
     VHostSCSI *s = VHOST_SCSI(vdev);
     VHostSCSICommon *vsc = VHOST_SCSI_COMMON(s);
@@ -125,7 +125,7 @@ static void vhost_scsi_set_status(VirtIODevice *vdev, uint8_t val)
     }
 
     if (vhost_dev_is_started(&vsc->dev) == start) {
-        return;
+        return 0;
     }
 
     if (start) {
@@ -139,6 +139,7 @@ static void vhost_scsi_set_status(VirtIODevice *vdev, uint8_t val)
     } else {
         vhost_scsi_stop(s);
     }
+    return 0;
 }
 
 static void vhost_dummy_handle_output(VirtIODevice *vdev, VirtQueue *vq)
diff --git a/hw/scsi/vhost-user-scsi.c b/hw/scsi/vhost-user-scsi.c
index adb41b9816..aba9fb82d2 100644
--- a/hw/scsi/vhost-user-scsi.c
+++ b/hw/scsi/vhost-user-scsi.c
@@ -52,19 +52,19 @@ static int vhost_user_scsi_start(VHostUserSCSI *s, Error **errp)
     return ret;
 }
 
-static void vhost_user_scsi_stop(VHostUserSCSI *s)
+static int vhost_user_scsi_stop(VHostUserSCSI *s)
 {
     VHostSCSICommon *vsc = VHOST_SCSI_COMMON(s);
 
     if (!s->started_vu) {
-        return;
+        return 0;
     }
     s->started_vu = false;
 
-    vhost_scsi_common_stop(vsc);
+    return vhost_scsi_common_stop(vsc);
 }
 
-static void vhost_user_scsi_set_status(VirtIODevice *vdev, uint8_t status)
+static int vhost_user_scsi_set_status(VirtIODevice *vdev, uint8_t status)
 {
     VHostUserSCSI *s = (VHostUserSCSI *)vdev;
     DeviceState *dev = DEVICE(vdev);
@@ -75,11 +75,11 @@ static void vhost_user_scsi_set_status(VirtIODevice *vdev, uint8_t status)
     int ret;
 
     if (!s->connected) {
-        return;
+        return -1;
     }
 
     if (vhost_dev_is_started(&vsc->dev) == should_start) {
-        return;
+        return 0;
     }
 
     if (should_start) {
@@ -91,8 +91,12 @@ static void vhost_user_scsi_set_status(VirtIODevice *vdev, uint8_t status)
             qemu_chr_fe_disconnect(&vs->conf.chardev);
         }
     } else {
-        vhost_user_scsi_stop(s);
+        ret = vhost_user_scsi_stop(s);
+        if (ret) {
+            return ret;
+        }
     }
+    return 0;
 }
 
 static void vhost_user_scsi_handle_output(VirtIODevice *vdev, VirtQueue *vq)
diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
index a7e73b1c99..d80d46baed 100644
--- a/hw/virtio/vdpa-dev.c
+++ b/hw/virtio/vdpa-dev.c
@@ -312,7 +312,7 @@ static void vhost_vdpa_device_stop(VirtIODevice *vdev)
     vhost_dev_disable_notifiers(&s->dev, vdev);
 }
 
-static void vhost_vdpa_device_set_status(VirtIODevice *vdev, uint8_t status)
+static int vhost_vdpa_device_set_status(VirtIODevice *vdev, uint8_t status)
 {
     VhostVdpaDevice *s = VHOST_VDPA_DEVICE(vdev);
     bool should_start = virtio_device_started(vdev, status);
@@ -324,7 +324,7 @@ static void vhost_vdpa_device_set_status(VirtIODevice *vdev, uint8_t status)
     }
 
     if (s->started == should_start) {
-        return;
+        return 0;
     }
 
     if (should_start) {
@@ -335,6 +335,7 @@ static void vhost_vdpa_device_set_status(VirtIODevice *vdev, uint8_t status)
     } else {
         vhost_vdpa_device_stop(vdev);
     }
+    return 0;
 }
 
 static const Property vhost_vdpa_device_properties[] = {
diff --git a/hw/virtio/vhost-user-base.c b/hw/virtio/vhost-user-base.c
index 2bc3423326..37718653bd 100644
--- a/hw/virtio/vhost-user-base.c
+++ b/hw/virtio/vhost-user-base.c
@@ -66,7 +66,7 @@ err_host_notifiers:
     vhost_dev_disable_notifiers(&vub->vhost_dev, vdev);
 }
 
-static void vub_stop(VirtIODevice *vdev)
+static int vub_stop(VirtIODevice *vdev)
 {
     VHostUserBase *vub = VHOST_USER_BASE(vdev);
     BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
@@ -74,34 +74,39 @@ static void vub_stop(VirtIODevice *vdev)
     int ret;
 
     if (!k->set_guest_notifiers) {
-        return;
+        return 0;
     }
 
-    vhost_dev_stop(&vub->vhost_dev, vdev, true);
+    ret = vhost_dev_stop(&vub->vhost_dev, vdev, true);
 
-    ret = k->set_guest_notifiers(qbus->parent, vub->vhost_dev.nvqs, false);
-    if (ret < 0) {
+    if (k->set_guest_notifiers(qbus->parent, vub->vhost_dev.nvqs, false) < 0) {
         error_report("vhost guest notifier cleanup failed: %d", ret);
-        return;
+        return -1;
     }
 
     vhost_dev_disable_notifiers(&vub->vhost_dev, vdev);
+    return ret;
 }
 
-static void vub_set_status(VirtIODevice *vdev, uint8_t status)
+static int vub_set_status(VirtIODevice *vdev, uint8_t status)
 {
     VHostUserBase *vub = VHOST_USER_BASE(vdev);
     bool should_start = virtio_device_should_start(vdev, status);
 
     if (vhost_dev_is_started(&vub->vhost_dev) == should_start) {
-        return;
+        return 0;
     }
 
     if (should_start) {
         vub_start(vdev);
     } else {
-        vub_stop(vdev);
+        int ret;
+        ret = vub_stop(vdev);
+        if (ret < 0) {
+            return ret;
+        }
     }
+    return 0;
 }
 
 /*
diff --git a/hw/virtio/vhost-user-fs.c b/hw/virtio/vhost-user-fs.c
index 3f00d79ed0..266c179671 100644
--- a/hw/virtio/vhost-user-fs.c
+++ b/hw/virtio/vhost-user-fs.c
@@ -100,7 +100,7 @@ err_host_notifiers:
     vhost_dev_disable_notifiers(&fs->vhost_dev, vdev);
 }
 
-static void vuf_stop(VirtIODevice *vdev)
+static int vuf_stop(VirtIODevice *vdev)
 {
     VHostUserFS *fs = VHOST_USER_FS(vdev);
     BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
@@ -108,34 +108,39 @@ static void vuf_stop(VirtIODevice *vdev)
     int ret;
 
     if (!k->set_guest_notifiers) {
-        return;
+        return 0;
     }
 
-    vhost_dev_stop(&fs->vhost_dev, vdev, true);
+    ret = vhost_dev_stop(&fs->vhost_dev, vdev, true);
 
-    ret = k->set_guest_notifiers(qbus->parent, fs->vhost_dev.nvqs, false);
-    if (ret < 0) {
+    if (k->set_guest_notifiers(qbus->parent, fs->vhost_dev.nvqs, false) < 0) {
         error_report("vhost guest notifier cleanup failed: %d", ret);
-        return;
+        return -1;
     }
 
     vhost_dev_disable_notifiers(&fs->vhost_dev, vdev);
+    return ret;
 }
 
-static void vuf_set_status(VirtIODevice *vdev, uint8_t status)
+static int vuf_set_status(VirtIODevice *vdev, uint8_t status)
 {
     VHostUserFS *fs = VHOST_USER_FS(vdev);
     bool should_start = virtio_device_should_start(vdev, status);
 
     if (vhost_dev_is_started(&fs->vhost_dev) == should_start) {
-        return;
+        return 0;
     }
 
     if (should_start) {
         vuf_start(vdev);
     } else {
-        vuf_stop(vdev);
+        int ret;
+        ret = vuf_stop(vdev);
+        if (ret < 0) {
+            return ret;
+        }
     }
+    return 0;
 }
 
 static uint64_t vuf_get_features(VirtIODevice *vdev,
diff --git a/hw/virtio/vhost-user-scmi.c b/hw/virtio/vhost-user-scmi.c
index 410a936ca7..a9778f1f4b 100644
--- a/hw/virtio/vhost-user-scmi.c
+++ b/hw/virtio/vhost-user-scmi.c
@@ -83,7 +83,7 @@ err_host_notifiers:
     return ret;
 }
 
-static void vu_scmi_stop(VirtIODevice *vdev)
+static int vu_scmi_stop(VirtIODevice *vdev)
 {
     VHostUserSCMI *scmi = VHOST_USER_SCMI(vdev);
     BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
@@ -93,41 +93,46 @@ static void vu_scmi_stop(VirtIODevice *vdev)
 
     /* vhost_dev_is_started() check in the callers is not fully reliable. */
     if (!scmi->started_vu) {
-        return;
+        return 0;
     }
     scmi->started_vu = false;
 
     if (!k->set_guest_notifiers) {
-        return;
+        return 0;
     }
 
-    vhost_dev_stop(vhost_dev, vdev, true);
+    ret = vhost_dev_stop(vhost_dev, vdev, true);
 
-    ret = k->set_guest_notifiers(qbus->parent, vhost_dev->nvqs, false);
-    if (ret < 0) {
+    if (k->set_guest_notifiers(qbus->parent, vhost_dev->nvqs, false) < 0) {
         error_report("vhost guest notifier cleanup failed: %d", ret);
-        return;
+        return -1;
     }
     vhost_dev_disable_notifiers(vhost_dev, vdev);
+    return ret;
 }
 
-static void vu_scmi_set_status(VirtIODevice *vdev, uint8_t status)
+static int vu_scmi_set_status(VirtIODevice *vdev, uint8_t status)
 {
     VHostUserSCMI *scmi = VHOST_USER_SCMI(vdev);
     bool should_start = virtio_device_should_start(vdev, status);
 
     if (!scmi->connected) {
-        return;
+        return -1;
     }
     if (vhost_dev_is_started(&scmi->vhost_dev) == should_start) {
-        return;
+        return 0;
     }
 
     if (should_start) {
         vu_scmi_start(vdev);
     } else {
-        vu_scmi_stop(vdev);
+        int ret;
+        ret = vu_scmi_stop(vdev);
+        if (ret < 0) {
+            return ret;
+        }
     }
+    return 0;
 }
 
 static uint64_t vu_scmi_get_features(VirtIODevice *vdev, uint64_t features,
diff --git a/hw/virtio/vhost-user-vsock.c b/hw/virtio/vhost-user-vsock.c
index 293273080b..7a4aec7bf4 100644
--- a/hw/virtio/vhost-user-vsock.c
+++ b/hw/virtio/vhost-user-vsock.c
@@ -54,23 +54,28 @@ const VhostDevConfigOps vsock_ops = {
     .vhost_dev_config_notifier = vuv_handle_config_change,
 };
 
-static void vuv_set_status(VirtIODevice *vdev, uint8_t status)
+static int vuv_set_status(VirtIODevice *vdev, uint8_t status)
 {
     VHostVSockCommon *vvc = VHOST_VSOCK_COMMON(vdev);
     bool should_start = virtio_device_should_start(vdev, status);
+    int ret;
 
     if (vhost_dev_is_started(&vvc->vhost_dev) == should_start) {
-        return;
+        return 0;
     }
 
     if (should_start) {
-        int ret = vhost_vsock_common_start(vdev);
+        ret = vhost_vsock_common_start(vdev);
         if (ret < 0) {
-            return;
+            return ret;
         }
     } else {
-        vhost_vsock_common_stop(vdev);
+        ret = vhost_vsock_common_stop(vdev);
+        if (ret < 0) {
+            return ret;
+        }
     }
+    return 0;
 }
 
 static uint64_t vuv_get_features(VirtIODevice *vdev,
diff --git a/hw/virtio/vhost-vsock-common.c b/hw/virtio/vhost-vsock-common.c
index 9ac587d20c..bbfbbdb317 100644
--- a/hw/virtio/vhost-vsock-common.c
+++ b/hw/virtio/vhost-vsock-common.c
@@ -95,7 +95,7 @@ err_host_notifiers:
     return ret;
 }
 
-void vhost_vsock_common_stop(VirtIODevice *vdev)
+int vhost_vsock_common_stop(VirtIODevice *vdev)
 {
     VHostVSockCommon *vvc = VHOST_VSOCK_COMMON(vdev);
     BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
@@ -103,18 +103,18 @@ void vhost_vsock_common_stop(VirtIODevice *vdev)
     int ret;
 
     if (!k->set_guest_notifiers) {
-        return;
+        return 0;
     }
 
-    vhost_dev_stop(&vvc->vhost_dev, vdev, true);
+    ret = vhost_dev_stop(&vvc->vhost_dev, vdev, true);
 
-    ret = k->set_guest_notifiers(qbus->parent, vvc->vhost_dev.nvqs, false);
-    if (ret < 0) {
+    if (k->set_guest_notifiers(qbus->parent, vvc->vhost_dev.nvqs, false) < 0) {
         error_report("vhost guest notifier cleanup failed: %d", ret);
-        return;
+        return -1;
     }
 
     vhost_dev_disable_notifiers(&vvc->vhost_dev, vdev);
+    return ret;
 }
 
 
diff --git a/hw/virtio/vhost-vsock.c b/hw/virtio/vhost-vsock.c
index 940b30fa27..62b36b28f5 100644
--- a/hw/virtio/vhost-vsock.c
+++ b/hw/virtio/vhost-vsock.c
@@ -67,37 +67,38 @@ static int vhost_vsock_set_running(VirtIODevice *vdev, int start)
 }
 
 
-static void vhost_vsock_set_status(VirtIODevice *vdev, uint8_t status)
+static int vhost_vsock_set_status(VirtIODevice *vdev, uint8_t status)
 {
     VHostVSockCommon *vvc = VHOST_VSOCK_COMMON(vdev);
     bool should_start = virtio_device_should_start(vdev, status);
     int ret;
 
     if (vhost_dev_is_started(&vvc->vhost_dev) == should_start) {
-        return;
+        return 0;
     }
 
     if (should_start) {
         ret = vhost_vsock_common_start(vdev);
         if (ret < 0) {
-            return;
+            return 0;
         }
 
         ret = vhost_vsock_set_running(vdev, 1);
         if (ret < 0) {
             vhost_vsock_common_stop(vdev);
             error_report("Error starting vhost vsock: %d", -ret);
-            return;
+            return 0;
         }
     } else {
         ret = vhost_vsock_set_running(vdev, 0);
         if (ret < 0) {
             error_report("vhost vsock set running failed: %d", ret);
-            return;
+            return 0;
         }
 
         vhost_vsock_common_stop(vdev);
     }
+    return 0;
 }
 
 static uint64_t vhost_vsock_get_features(VirtIODevice *vdev,
diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 2eb5a14fa2..e719c078aa 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -958,7 +958,7 @@ static void virtio_balloon_device_reset(VirtIODevice *vdev)
     s->poison_val = 0;
 }
 
-static void virtio_balloon_set_status(VirtIODevice *vdev, uint8_t status)
+static int virtio_balloon_set_status(VirtIODevice *vdev, uint8_t status)
 {
     VirtIOBalloon *s = VIRTIO_BALLOON(vdev);
 
@@ -988,6 +988,7 @@ static void virtio_balloon_set_status(VirtIODevice *vdev, uint8_t status)
             qemu_mutex_unlock(&s->free_page_lock);
         }
     }
+    return 0;
 }
 
 static ResettableState *virtio_balloon_get_reset_state(Object *obj)
diff --git a/hw/virtio/virtio-crypto.c b/hw/virtio/virtio-crypto.c
index a1b3c90618..626279dc5b 100644
--- a/hw/virtio/virtio-crypto.c
+++ b/hw/virtio/virtio-crypto.c
@@ -1197,11 +1197,12 @@ static void virtio_crypto_vhost_status(VirtIOCrypto *c, uint8_t status)
     }
 }
 
-static void virtio_crypto_set_status(VirtIODevice *vdev, uint8_t status)
+static int virtio_crypto_set_status(VirtIODevice *vdev, uint8_t status)
 {
     VirtIOCrypto *vcrypto = VIRTIO_CRYPTO(vdev);
 
     virtio_crypto_vhost_status(vcrypto, status);
+    return 0;
 }
 
 static void virtio_crypto_guest_notifier_mask(VirtIODevice *vdev, int idx,
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index b6e7e01ef7..63ec12014f 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -1522,9 +1522,10 @@ static void virtio_iommu_device_reset_exit(Object *obj, ResetType type)
                                    NULL, NULL, virtio_iommu_put_endpoint);
 }
 
-static void virtio_iommu_set_status(VirtIODevice *vdev, uint8_t status)
+static int virtio_iommu_set_status(VirtIODevice *vdev, uint8_t status)
 {
     trace_virtio_iommu_device_status(status);
+    return 0;
 }
 
 static void virtio_iommu_instance_init(Object *obj)
diff --git a/hw/virtio/virtio-rng.c b/hw/virtio/virtio-rng.c
index a515fc5cd9..4748d039f9 100644
--- a/hw/virtio/virtio-rng.c
+++ b/hw/virtio/virtio-rng.c
@@ -159,17 +159,18 @@ static void check_rate_limit(void *opaque)
     vrng->activate_timer = true;
 }
 
-static void virtio_rng_set_status(VirtIODevice *vdev, uint8_t status)
+static int virtio_rng_set_status(VirtIODevice *vdev, uint8_t status)
 {
     VirtIORNG *vrng = VIRTIO_RNG(vdev);
 
     if (!vdev->vm_running) {
-        return;
+        return 0;
     }
     vdev->status = status;
 
     /* Something changed, try to process buffers */
     virtio_rng_process(vrng);
+    return 0;
 }
 
 static void virtio_rng_device_realize(DeviceState *dev, Error **errp)
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 202a052053..cdab893082 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -2221,12 +2221,12 @@ int virtio_set_status(VirtIODevice *vdev, uint8_t val)
 {
     VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
     trace_virtio_set_status(vdev, val);
+    int ret = 0;
 
     if (virtio_vdev_has_feature(vdev, VIRTIO_F_VERSION_1)) {
         if (!(vdev->status & VIRTIO_CONFIG_S_FEATURES_OK) &&
             val & VIRTIO_CONFIG_S_FEATURES_OK) {
-            int ret = virtio_validate_features(vdev);
-
+            ret = virtio_validate_features(vdev);
             if (ret) {
                 return ret;
             }
@@ -2238,12 +2238,14 @@ int virtio_set_status(VirtIODevice *vdev, uint8_t val)
         virtio_set_started(vdev, val & VIRTIO_CONFIG_S_DRIVER_OK);
     }
 
-    if (k->set_status) {
-        k->set_status(vdev, val);
+    ret = k->set_status(vdev, val);
+    if (ret) {
+        qemu_log("set %s status to %d failed, old status: %d\n",
+                 vdev->name, val, vdev->status);
     }
     vdev->status = val;
 
-    return 0;
+    return ret;
 }
 
 static enum virtio_device_endian virtio_default_endian(void)
@@ -3419,7 +3421,7 @@ void virtio_cleanup(VirtIODevice *vdev)
     qemu_del_vm_change_state_handler(vdev->vmstate);
 }
 
-static void virtio_vmstate_change(void *opaque, bool running, RunState state)
+static int virtio_vmstate_change(void *opaque, bool running, RunState state)
 {
     VirtIODevice *vdev = opaque;
     BusState *qbus = qdev_get_parent_bus(DEVICE(vdev));
@@ -3436,8 +3438,13 @@ static void virtio_vmstate_change(void *opaque, bool running, RunState state)
     }
 
     if (!backend_run) {
-        virtio_set_status(vdev, vdev->status);
+        // the return value was used for stopping VM during migration
+        int ret = virtio_set_status(vdev, vdev->status);
+        if (ret) {
+            return ret;
+        }
     }
+    return 0;
 }
 
 void virtio_instance_init_common(Object *proxy_obj, void *data,
@@ -3489,7 +3496,7 @@ void virtio_init(VirtIODevice *vdev, uint16_t device_id, size_t config_size)
         vdev->config = NULL;
     }
     vdev->vmstate = qdev_add_vm_change_state_handler(DEVICE(vdev),
-            virtio_vmstate_change, NULL, vdev);
+            NULL, virtio_vmstate_change, vdev);
     vdev->device_endian = virtio_default_endian();
     vdev->use_guest_notifier_mask = true;
 }
diff --git a/include/hw/virtio/vhost-scsi-common.h b/include/hw/virtio/vhost-scsi-common.h
index c5d2c09455..d54d9c916f 100644
--- a/include/hw/virtio/vhost-scsi-common.h
+++ b/include/hw/virtio/vhost-scsi-common.h
@@ -40,7 +40,7 @@ struct VHostSCSICommon {
 };
 
 int vhost_scsi_common_start(VHostSCSICommon *vsc, Error **errp);
-void vhost_scsi_common_stop(VHostSCSICommon *vsc);
+int vhost_scsi_common_stop(VHostSCSICommon *vsc);
 char *vhost_scsi_common_get_fw_dev_path(FWPathProvider *p, BusState *bus,
                                         DeviceState *dev);
 void vhost_scsi_common_set_config(VirtIODevice *vdev, const uint8_t *config);
diff --git a/include/hw/virtio/vhost-vsock-common.h b/include/hw/virtio/vhost-vsock-common.h
index 75a74e8a99..01bf6062af 100644
--- a/include/hw/virtio/vhost-vsock-common.h
+++ b/include/hw/virtio/vhost-vsock-common.h
@@ -42,7 +42,7 @@ struct VHostVSockCommon {
 };
 
 int vhost_vsock_common_start(VirtIODevice *vdev);
-void vhost_vsock_common_stop(VirtIODevice *vdev);
+int vhost_vsock_common_stop(VirtIODevice *vdev);
 int vhost_vsock_common_pre_save(void *opaque);
 int vhost_vsock_common_post_load(void *opaque, int version_id);
 void vhost_vsock_common_realize(VirtIODevice *vdev);
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index 6386910280..a7ee9bec75 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -186,7 +186,7 @@ struct VirtioDeviceClass {
     void (*get_config)(VirtIODevice *vdev, uint8_t *config);
     void (*set_config)(VirtIODevice *vdev, const uint8_t *config);
     void (*reset)(VirtIODevice *vdev);
-    void (*set_status)(VirtIODevice *vdev, uint8_t val);
+    int (*set_status)(VirtIODevice *vdev, uint8_t val);
     /* Device must validate queue_index.  */
     void (*queue_reset)(VirtIODevice *vdev, uint32_t queue_index);
     /* Device must validate queue_index.  */
diff --git a/include/system/vhost-user-backend.h b/include/system/vhost-user-backend.h
index 327b0b84f1..87765a1218 100644
--- a/include/system/vhost-user-backend.h
+++ b/include/system/vhost-user-backend.h
@@ -43,6 +43,6 @@ struct VhostUserBackend {
 int vhost_user_backend_dev_init(VhostUserBackend *b, VirtIODevice *vdev,
                                 unsigned nvqs, Error **errp);
 void vhost_user_backend_start(VhostUserBackend *b);
-void vhost_user_backend_stop(VhostUserBackend *b);
+int vhost_user_backend_stop(VhostUserBackend *b);
 
 #endif
-- 
2.44.0

[PATCH v2 1/3] virtio: add VM state change cb with return value

Posted by Haoqian He 2 weeks, 5 days ago

This patch contains two changes:

1. Add VM state change cb type VMChangeStateHandlerExt which has return
value for virtio devices VMChangeStateEntry. When VM state changes,
virtio device will call the _Ext version.

2. Add return value for vm_state_notify().

Signed-off-by: Haoqian He <haoqian.he@smartx.com>
---
 hw/block/virtio-blk.c             |  2 +-
 hw/core/vm-change-state-handler.c | 14 ++++++++------
 hw/scsi/scsi-bus.c                |  2 +-
 hw/vfio/migration.c               |  2 +-
 hw/virtio/virtio.c                |  5 +++--
 include/system/runstate.h         | 11 ++++++++---
 system/cpus.c                     |  4 ++--
 system/runstate.c                 | 25 ++++++++++++++++++++-----
 8 files changed, 44 insertions(+), 21 deletions(-)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 5135b4d8f1..4a48a16790 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -1928,7 +1928,7 @@ static void virtio_blk_device_realize(DeviceState *dev, Error **errp)
      * called after ->start_ioeventfd() has already set blk's AioContext.
      */
     s->change =
-        qdev_add_vm_change_state_handler(dev, virtio_blk_dma_restart_cb, s);
+        qdev_add_vm_change_state_handler(dev, virtio_blk_dma_restart_cb, NULL, s);
 
     blk_ram_registrar_init(&s->blk_ram_registrar, s->blk);
     blk_set_dev_ops(s->blk, &virtio_block_ops, s);
diff --git a/hw/core/vm-change-state-handler.c b/hw/core/vm-change-state-handler.c
index 7064995578..d5045b17c1 100644
--- a/hw/core/vm-change-state-handler.c
+++ b/hw/core/vm-change-state-handler.c
@@ -40,6 +40,7 @@ static int qdev_get_dev_tree_depth(DeviceState *dev)
  * qdev_add_vm_change_state_handler:
  * @dev: the device that owns this handler
  * @cb: the callback function to be invoked
+ * @cb_ext: the callback function with return value to be invoked
  * @opaque: user data passed to the callback function
  *
  * This function works like qemu_add_vm_change_state_handler() except callbacks
@@ -54,21 +55,22 @@ static int qdev_get_dev_tree_depth(DeviceState *dev)
  */
 VMChangeStateEntry *qdev_add_vm_change_state_handler(DeviceState *dev,
                                                      VMChangeStateHandler *cb,
+                                                     VMChangeStateHandlerExt *cb_ext,
                                                      void *opaque)
 {
-    return qdev_add_vm_change_state_handler_full(dev, cb, NULL, opaque);
+    return qdev_add_vm_change_state_handler_full(dev, cb, NULL, cb_ext, opaque);
 }
 
 /*
  * Exactly like qdev_add_vm_change_state_handler() but passes a prepare_cb
- * argument too.
+ * and the cb_ext arguments too.
  */
 VMChangeStateEntry *qdev_add_vm_change_state_handler_full(
-    DeviceState *dev, VMChangeStateHandler *cb,
-    VMChangeStateHandler *prepare_cb, void *opaque)
+    DeviceState *dev, VMChangeStateHandler *cb, VMChangeStateHandler *prepare_cb,
+    VMChangeStateHandlerExt *cb_ext, void *opaque)
 {
     int depth = qdev_get_dev_tree_depth(dev);
 
-    return qemu_add_vm_change_state_handler_prio_full(cb, prepare_cb, opaque,
-                                                      depth);
+    return qemu_add_vm_change_state_handler_prio_full(cb, prepare_cb, cb_ext,
+                                                      opaque, depth);
 }
diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
index 7d4546800f..ec098f5f0a 100644
--- a/hw/scsi/scsi-bus.c
+++ b/hw/scsi/scsi-bus.c
@@ -356,7 +356,7 @@ static void scsi_qdev_realize(DeviceState *qdev, Error **errp)
         return;
     }
     dev->vmsentry = qdev_add_vm_change_state_handler(DEVICE(dev),
-            scsi_dma_restart_cb, dev);
+            scsi_dma_restart_cb, NULL, dev);
 }
 
 static void scsi_qdev_unrealize(DeviceState *qdev)
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 416643ddd6..f531db83ea 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -1015,7 +1015,7 @@ static int vfio_migration_init(VFIODevice *vbasedev)
                      vfio_vmstate_change_prepare :
                      NULL;
     migration->vm_state = qdev_add_vm_change_state_handler_full(
-        vbasedev->dev, vfio_vmstate_change, prepare_cb, vbasedev);
+        vbasedev->dev, vfio_vmstate_change, prepare_cb, NULL, vbasedev);
     migration_add_notifier(&migration->migration_state,
                            vfio_migration_state_notifier);
 
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 85110bce37..5e8d4cab53 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -3419,7 +3419,7 @@ void virtio_cleanup(VirtIODevice *vdev)
     qemu_del_vm_change_state_handler(vdev->vmstate);
 }
 
-static void virtio_vmstate_change(void *opaque, bool running, RunState state)
+static int virtio_vmstate_change(void *opaque, bool running, RunState state)
 {
     VirtIODevice *vdev = opaque;
     BusState *qbus = qdev_get_parent_bus(DEVICE(vdev));
@@ -3438,6 +3438,7 @@ static void virtio_vmstate_change(void *opaque, bool running, RunState state)
     if (!backend_run) {
         virtio_set_status(vdev, vdev->status);
     }
+    return 0;
 }
 
 void virtio_instance_init_common(Object *proxy_obj, void *data,
@@ -3489,7 +3490,7 @@ void virtio_init(VirtIODevice *vdev, uint16_t device_id, size_t config_size)
         vdev->config = NULL;
     }
     vdev->vmstate = qdev_add_vm_change_state_handler(DEVICE(vdev),
-            virtio_vmstate_change, vdev);
+            NULL, virtio_vmstate_change, vdev);
     vdev->device_endian = virtio_default_endian();
     vdev->use_guest_notifier_mask = true;
 }
diff --git a/include/system/runstate.h b/include/system/runstate.h
index bffc3719d4..af33ea92b6 100644
--- a/include/system/runstate.h
+++ b/include/system/runstate.h
@@ -12,6 +12,7 @@ bool runstate_needs_reset(void);
 void runstate_replay_enable(void);
 
 typedef void VMChangeStateHandler(void *opaque, bool running, RunState state);
+typedef int VMChangeStateHandlerExt(void *opaque, bool running, RunState state);
 
 VMChangeStateEntry *qemu_add_vm_change_state_handler(VMChangeStateHandler *cb,
                                                      void *opaque);
@@ -20,21 +21,25 @@ VMChangeStateEntry *qemu_add_vm_change_state_handler_prio(
 VMChangeStateEntry *
 qemu_add_vm_change_state_handler_prio_full(VMChangeStateHandler *cb,
                                            VMChangeStateHandler *prepare_cb,
+                                           VMChangeStateHandlerExt *cb_ext,
                                            void *opaque, int priority);
 VMChangeStateEntry *qdev_add_vm_change_state_handler(DeviceState *dev,
                                                      VMChangeStateHandler *cb,
+                                                     VMChangeStateHandlerExt *cb_ext,
                                                      void *opaque);
 VMChangeStateEntry *qdev_add_vm_change_state_handler_full(
-    DeviceState *dev, VMChangeStateHandler *cb,
-    VMChangeStateHandler *prepare_cb, void *opaque);
+    DeviceState *dev, VMChangeStateHandler *cb, VMChangeStateHandler *prepare_cb,
+    VMChangeStateHandlerExt *cb_ext, void *opaque);
 void qemu_del_vm_change_state_handler(VMChangeStateEntry *e);
 /**
  * vm_state_notify: Notify the state of the VM
  *
  * @running: whether the VM is running or not.
  * @state: the #RunState of the VM.
+ *
+ * Return the result of the callback which has return value.
  */
-void vm_state_notify(bool running, RunState state);
+int vm_state_notify(bool running, RunState state);
 
 static inline bool shutdown_caused_by_guest(ShutdownCause cause)
 {
diff --git a/system/cpus.c b/system/cpus.c
index 2cc5f887ab..6e1cf5720f 100644
--- a/system/cpus.c
+++ b/system/cpus.c
@@ -299,14 +299,14 @@ static int do_vm_stop(RunState state, bool send_stop)
         if (oldstate == RUN_STATE_RUNNING) {
             pause_all_vcpus();
         }
-        vm_state_notify(0, state);
+        ret = vm_state_notify(0, state);
         if (send_stop) {
             qapi_event_send_stop();
         }
     }
 
     bdrv_drain_all();
-    ret = bdrv_flush_all();
+    ret |= bdrv_flush_all();
     trace_vm_stop_flush_all(ret);
 
     return ret;
diff --git a/system/runstate.c b/system/runstate.c
index 272801d307..2219cec409 100644
--- a/system/runstate.c
+++ b/system/runstate.c
@@ -297,6 +297,7 @@ void qemu_system_vmstop_request(RunState state)
 struct VMChangeStateEntry {
     VMChangeStateHandler *cb;
     VMChangeStateHandler *prepare_cb;
+    VMChangeStateHandlerExt *cb_ext;
     void *opaque;
     QTAILQ_ENTRY(VMChangeStateEntry) entries;
     int priority;
@@ -320,14 +321,15 @@ static QTAILQ_HEAD(, VMChangeStateEntry) vm_change_state_head =
 VMChangeStateEntry *qemu_add_vm_change_state_handler_prio(
         VMChangeStateHandler *cb, void *opaque, int priority)
 {
-    return qemu_add_vm_change_state_handler_prio_full(cb, NULL, opaque,
-                                                      priority);
+    return qemu_add_vm_change_state_handler_prio_full(cb, NULL, NULL,
+                                                      opaque, priority);
 }
 
 /**
  * qemu_add_vm_change_state_handler_prio_full:
  * @cb: the main callback to invoke
  * @prepare_cb: a callback to invoke before the main callback
+ * @cb_ext: the main callback to invoke with return value
  * @opaque: user data passed to the callbacks
  * @priority: low priorities execute first when the vm runs and the reverse is
  *            true when the vm stops
@@ -344,6 +346,7 @@ VMChangeStateEntry *qemu_add_vm_change_state_handler_prio(
 VMChangeStateEntry *
 qemu_add_vm_change_state_handler_prio_full(VMChangeStateHandler *cb,
                                            VMChangeStateHandler *prepare_cb,
+                                           VMChangeStateHandlerExt *cb_ext,
                                            void *opaque, int priority)
 {
     VMChangeStateEntry *e;
@@ -352,6 +355,7 @@ qemu_add_vm_change_state_handler_prio_full(VMChangeStateHandler *cb,
     e = g_malloc0(sizeof(*e));
     e->cb = cb;
     e->prepare_cb = prepare_cb;
+    e->cb_ext = cb_ext;
     e->opaque = opaque;
     e->priority = priority;
 
@@ -379,9 +383,10 @@ void qemu_del_vm_change_state_handler(VMChangeStateEntry *e)
     g_free(e);
 }
 
-void vm_state_notify(bool running, RunState state)
+int vm_state_notify(bool running, RunState state)
 {
     VMChangeStateEntry *e, *next;
+    int ret = 0;
 
     trace_vm_state_notify(running, state, RunState_str(state));
 
@@ -393,7 +398,12 @@ void vm_state_notify(bool running, RunState state)
         }
 
         QTAILQ_FOREACH_SAFE(e, &vm_change_state_head, entries, next) {
-            e->cb(e->opaque, running, state);
+            if (e->cb) {
+                e->cb(e->opaque, running, state);
+            } else if (e->cb_ext) {
+                // no need to process the result when starting VM
+                e->cb_ext(e->opaque, running, state);
+            }
         }
     } else {
         QTAILQ_FOREACH_REVERSE_SAFE(e, &vm_change_state_head, entries, next) {
@@ -403,9 +413,14 @@ void vm_state_notify(bool running, RunState state)
         }
 
         QTAILQ_FOREACH_REVERSE_SAFE(e, &vm_change_state_head, entries, next) {
-            e->cb(e->opaque, running, state);
+            if (e->cb) {
+                e->cb(e->opaque, running, state);
+            } else if (e->cb_ext) {
+                ret |= e->cb_ext(e->opaque, running, state);
+            }
         }
     }
+    return ret;
 }
 
 static ShutdownCause reset_requested;
-- 
2.48.1

Re: [PATCH v2 1/3] virtio: add VM state change cb with return value

Posted by Stefano Garzarella 1 week, 6 days ago

On Fri, Mar 14, 2025 at 06:15:32AM -0400, Haoqian He wrote:
>This patch contains two changes:
>
>1. Add VM state change cb type VMChangeStateHandlerExt which has return
>value for virtio devices VMChangeStateEntry. When VM state changes,
>virtio device will call the _Ext version.
>
>2. Add return value for vm_state_notify().

Can you explain why these changes are needed?

>
>Signed-off-by: Haoqian He <haoqian.he@smartx.com>
>---
> hw/block/virtio-blk.c             |  2 +-
> hw/core/vm-change-state-handler.c | 14 ++++++++------
> hw/scsi/scsi-bus.c                |  2 +-
> hw/vfio/migration.c               |  2 +-
> hw/virtio/virtio.c                |  5 +++--
> include/system/runstate.h         | 11 ++++++++---
> system/cpus.c                     |  4 ++--
> system/runstate.c                 | 25 ++++++++++++++++++++-----
> 8 files changed, 44 insertions(+), 21 deletions(-)
>
>diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
>index 5135b4d8f1..4a48a16790 100644
>--- a/hw/block/virtio-blk.c
>+++ b/hw/block/virtio-blk.c
>@@ -1928,7 +1928,7 @@ static void virtio_blk_device_realize(DeviceState *dev, Error **errp)
>      * called after ->start_ioeventfd() has already set blk's AioContext.
>      */
>     s->change =
>-        qdev_add_vm_change_state_handler(dev, virtio_blk_dma_restart_cb, s);
>+        qdev_add_vm_change_state_handler(dev, virtio_blk_dma_restart_cb, NULL, s);
>
>     blk_ram_registrar_init(&s->blk_ram_registrar, s->blk);
>     blk_set_dev_ops(s->blk, &virtio_block_ops, s);
>diff --git a/hw/core/vm-change-state-handler.c b/hw/core/vm-change-state-handler.c
>index 7064995578..d5045b17c1 100644
>--- a/hw/core/vm-change-state-handler.c
>+++ b/hw/core/vm-change-state-handler.c
>@@ -40,6 +40,7 @@ static int qdev_get_dev_tree_depth(DeviceState *dev)
>  * qdev_add_vm_change_state_handler:
>  * @dev: the device that owns this handler
>  * @cb: the callback function to be invoked
>+ * @cb_ext: the callback function with return value to be invoked

I think we need to document what happens if both `cb` and `cb_ext` are 
specified.

Also `cb_ext` is very cryptic, I'm not good with names, but what about 
something like `cb_ret`?

>  * @opaque: user data passed to the callback function
>  *
>  * This function works like qemu_add_vm_change_state_handler() except callbacks
>@@ -54,21 +55,22 @@ static int qdev_get_dev_tree_depth(DeviceState *dev)
>  */
> VMChangeStateEntry *qdev_add_vm_change_state_handler(DeviceState *dev,
>                                                      VMChangeStateHandler *cb,
>+                                                     VMChangeStateHandlerExt *cb_ext,
>                                                      void *opaque)
> {
>-    return qdev_add_vm_change_state_handler_full(dev, cb, NULL, opaque);
>+    return qdev_add_vm_change_state_handler_full(dev, cb, NULL, cb_ext, opaque);
> }
>
> /*
>  * Exactly like qdev_add_vm_change_state_handler() but passes a prepare_cb
>- * argument too.
>+ * and the cb_ext arguments too.
>  */
> VMChangeStateEntry *qdev_add_vm_change_state_handler_full(
>-    DeviceState *dev, VMChangeStateHandler *cb,
>-    VMChangeStateHandler *prepare_cb, void *opaque)
>+    DeviceState *dev, VMChangeStateHandler *cb, VMChangeStateHandler *prepare_cb,
>+    VMChangeStateHandlerExt *cb_ext, void *opaque)

> {
>     int depth = qdev_get_dev_tree_depth(dev);
>
>-    return qemu_add_vm_change_state_handler_prio_full(cb, prepare_cb, opaque,
>-                                                      depth);
>+    return qemu_add_vm_change_state_handler_prio_full(cb, prepare_cb, cb_ext,
>+                                                      opaque, depth);
> }
>diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
>index 7d4546800f..ec098f5f0a 100644
>--- a/hw/scsi/scsi-bus.c
>+++ b/hw/scsi/scsi-bus.c
>@@ -356,7 +356,7 @@ static void scsi_qdev_realize(DeviceState *qdev, Error **errp)
>         return;
>     }
>     dev->vmsentry = qdev_add_vm_change_state_handler(DEVICE(dev),
>-            scsi_dma_restart_cb, dev);
>+            scsi_dma_restart_cb, NULL, dev);
> }
>
> static void scsi_qdev_unrealize(DeviceState *qdev)
>diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
>index 416643ddd6..f531db83ea 100644
>--- a/hw/vfio/migration.c
>+++ b/hw/vfio/migration.c
>@@ -1015,7 +1015,7 @@ static int vfio_migration_init(VFIODevice *vbasedev)
>                      vfio_vmstate_change_prepare :
>                      NULL;
>     migration->vm_state = qdev_add_vm_change_state_handler_full(
>-        vbasedev->dev, vfio_vmstate_change, prepare_cb, vbasedev);
>+        vbasedev->dev, vfio_vmstate_change, prepare_cb, NULL, vbasedev);
>     migration_add_notifier(&migration->migration_state,
>                            vfio_migration_state_notifier);
>
>diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
>index 85110bce37..5e8d4cab53 100644
>--- a/hw/virtio/virtio.c
>+++ b/hw/virtio/virtio.c
>@@ -3419,7 +3419,7 @@ void virtio_cleanup(VirtIODevice *vdev)
>     qemu_del_vm_change_state_handler(vdev->vmstate);
> }
>
>-static void virtio_vmstate_change(void *opaque, bool running, RunState state)
>+static int virtio_vmstate_change(void *opaque, bool running, RunState state)
> {
>     VirtIODevice *vdev = opaque;
>     BusState *qbus = qdev_get_parent_bus(DEVICE(vdev));
>@@ -3438,6 +3438,7 @@ static void virtio_vmstate_change(void *opaque, bool running, RunState state)
>     if (!backend_run) {
>         virtio_set_status(vdev, vdev->status);
>     }
>+    return 0;
> }
>
> void virtio_instance_init_common(Object *proxy_obj, void *data,
>@@ -3489,7 +3490,7 @@ void virtio_init(VirtIODevice *vdev, uint16_t device_id, size_t config_size)
>         vdev->config = NULL;
>     }
>     vdev->vmstate = qdev_add_vm_change_state_handler(DEVICE(vdev),
>-            virtio_vmstate_change, vdev);
>+            NULL, virtio_vmstate_change, vdev);

IIUC virtio_vmstate_change now returns always 0, so what is the point of
this change?

I'd also do this change in a separate commit if it's really needed.

>     vdev->device_endian = virtio_default_endian();
>     vdev->use_guest_notifier_mask = true;
> }
>diff --git a/include/system/runstate.h b/include/system/runstate.h
>index bffc3719d4..af33ea92b6 100644
>--- a/include/system/runstate.h
>+++ b/include/system/runstate.h
>@@ -12,6 +12,7 @@ bool runstate_needs_reset(void);
> void runstate_replay_enable(void);
>
> typedef void VMChangeStateHandler(void *opaque, bool running, RunState state);
>+typedef int VMChangeStateHandlerExt(void *opaque, bool running, RunState state);

Ditto about "Ext"

>
> VMChangeStateEntry *qemu_add_vm_change_state_handler(VMChangeStateHandler *cb,
>                                                      void *opaque);
>@@ -20,21 +21,25 @@ VMChangeStateEntry *qemu_add_vm_change_state_handler_prio(
> VMChangeStateEntry *
> qemu_add_vm_change_state_handler_prio_full(VMChangeStateHandler *cb,
>                                            VMChangeStateHandler *prepare_cb,
>+                                           VMChangeStateHandlerExt *cb_ext,
>                                            void *opaque, int priority);
> VMChangeStateEntry *qdev_add_vm_change_state_handler(DeviceState *dev,
>                                                      VMChangeStateHandler *cb,
>+                                                     VMChangeStateHandlerExt *cb_ext,
>                                                      void *opaque);
> VMChangeStateEntry *qdev_add_vm_change_state_handler_full(
>-    DeviceState *dev, VMChangeStateHandler *cb,
>-    VMChangeStateHandler *prepare_cb, void *opaque);
>+    DeviceState *dev, VMChangeStateHandler *cb, VMChangeStateHandler *prepare_cb,
>+    VMChangeStateHandlerExt *cb_ext, void *opaque);
> void qemu_del_vm_change_state_handler(VMChangeStateEntry *e);
> /**
>  * vm_state_notify: Notify the state of the VM
>  *
>  * @running: whether the VM is running or not.
>  * @state: the #RunState of the VM.
>+ *
>+ * Return the result of the callback which has return value.

What if the callback has no return value?

>  */
>-void vm_state_notify(bool running, RunState state);
>+int vm_state_notify(bool running, RunState state);
>
> static inline bool shutdown_caused_by_guest(ShutdownCause cause)
> {
>diff --git a/system/cpus.c b/system/cpus.c
>index 2cc5f887ab..6e1cf5720f 100644
>--- a/system/cpus.c
>+++ b/system/cpus.c
>@@ -299,14 +299,14 @@ static int do_vm_stop(RunState state, bool send_stop)
>         if (oldstate == RUN_STATE_RUNNING) {
>             pause_all_vcpus();
>         }
>-        vm_state_notify(0, state);
>+        ret = vm_state_notify(0, state);
>         if (send_stop) {
>             qapi_event_send_stop();
>         }
>     }
>
>     bdrv_drain_all();
>-    ret = bdrv_flush_all();
>+    ret |= bdrv_flush_all();

Are we sure this is the right thing to do?
If vm_state_notify() failed, shouldn't we go out first, and then why put 
them in or?

I think it should be explained in the commit or in a comment here.

>     trace_vm_stop_flush_all(ret);
>
>     return ret;
>diff --git a/system/runstate.c b/system/runstate.c
>index 272801d307..2219cec409 100644
>--- a/system/runstate.c
>+++ b/system/runstate.c
>@@ -297,6 +297,7 @@ void qemu_system_vmstop_request(RunState state)
> struct VMChangeStateEntry {
>     VMChangeStateHandler *cb;
>     VMChangeStateHandler *prepare_cb;
>+    VMChangeStateHandlerExt *cb_ext;
>     void *opaque;
>     QTAILQ_ENTRY(VMChangeStateEntry) entries;
>     int priority;
>@@ -320,14 +321,15 @@ static QTAILQ_HEAD(, VMChangeStateEntry) vm_change_state_head =
> VMChangeStateEntry *qemu_add_vm_change_state_handler_prio(
>         VMChangeStateHandler *cb, void *opaque, int priority)
> {
>-    return qemu_add_vm_change_state_handler_prio_full(cb, NULL, opaque,
>-                                                      priority);
>+    return qemu_add_vm_change_state_handler_prio_full(cb, NULL, NULL,
>+                                                      opaque, priority);
> }
>
> /**
>  * qemu_add_vm_change_state_handler_prio_full:
>  * @cb: the main callback to invoke
>  * @prepare_cb: a callback to invoke before the main callback
>+ * @cb_ext: the main callback to invoke with return value
>  * @opaque: user data passed to the callbacks
>  * @priority: low priorities execute first when the vm runs and the reverse is
>  *            true when the vm stops
>@@ -344,6 +346,7 @@ VMChangeStateEntry *qemu_add_vm_change_state_handler_prio(
> VMChangeStateEntry *
> qemu_add_vm_change_state_handler_prio_full(VMChangeStateHandler *cb,
>                                            VMChangeStateHandler *prepare_cb,
>+                                           VMChangeStateHandlerExt *cb_ext,
>                                            void *opaque, int priority)
> {
>     VMChangeStateEntry *e;
>@@ -352,6 +355,7 @@ qemu_add_vm_change_state_handler_prio_full(VMChangeStateHandler *cb,
>     e = g_malloc0(sizeof(*e));
>     e->cb = cb;
>     e->prepare_cb = prepare_cb;
>+    e->cb_ext = cb_ext;
>     e->opaque = opaque;
>     e->priority = priority;
>
>@@ -379,9 +383,10 @@ void qemu_del_vm_change_state_handler(VMChangeStateEntry *e)
>     g_free(e);
> }
>
>-void vm_state_notify(bool running, RunState state)
>+int vm_state_notify(bool running, RunState state)
> {
>     VMChangeStateEntry *e, *next;
>+    int ret = 0;
>
>     trace_vm_state_notify(running, state, RunState_str(state));
>
>@@ -393,7 +398,12 @@ void vm_state_notify(bool running, RunState state)
>         }
>
>         QTAILQ_FOREACH_SAFE(e, &vm_change_state_head, entries, next) {
>-            e->cb(e->opaque, running, state);
>+            if (e->cb) {
>+                e->cb(e->opaque, running, state);
>+            } else if (e->cb_ext) {
>+                // no need to process the result when starting VM

Why?

(a good comment should explain more why than what we're doing which is 
pretty clear)

>+                e->cb_ext(e->opaque, running, state);
>+            }
>         }
>     } else {
>         QTAILQ_FOREACH_REVERSE_SAFE(e, &vm_change_state_head, entries, next) {
>@@ -403,9 +413,14 @@ void vm_state_notify(bool running, RunState state)
>         }
>
>         QTAILQ_FOREACH_REVERSE_SAFE(e, &vm_change_state_head, entries, next) {
>-            e->cb(e->opaque, running, state);
>+            if (e->cb) {
>+                e->cb(e->opaque, running, state);
>+            } else if (e->cb_ext) {
>+                ret |= e->cb_ext(e->opaque, running, state);

I think putting them in or should be documented or at least explained 
somewhere. It's not clear to me, but it's true that I don't know this 
code.

>+            }
>         }
>     }
>+    return ret;
> }
>
> static ShutdownCause reset_requested;
>-- 
>2.48.1
>

Re: [PATCH v2 1/3] virtio: add VM state change cb with return value

Posted by Haoqian He 1 week, 6 days ago

> 2025年3月19日 22:50，Stefano Garzarella <sgarzare@redhat.com> 写道：
> 
> On Fri, Mar 14, 2025 at 06:15:32AM -0400, Haoqian He wrote:
>> This patch contains two changes:
>> 
>> 1. Add VM state change cb type VMChangeStateHandlerExt which has return
>> value for virtio devices VMChangeStateEntry. When VM state changes,
>> virtio device will call the _Ext version.
>> 
>> 2. Add return value for vm_state_notify().
> 
> Can you explain why these changes are needed?

[1] The first and second patches are both pre-patch of the third patch, and the reason
for these changes is in the email cover and the third patch commit message.
I will briefly explain it in the commit message of the pre-patch.

> 
>> 
>> Signed-off-by: Haoqian He <haoqian.he@smartx.com>
>> ---
>> hw/block/virtio-blk.c             |  2 +-
>> hw/core/vm-change-state-handler.c | 14 ++++++++------
>> hw/scsi/scsi-bus.c                |  2 +-
>> hw/vfio/migration.c               |  2 +-
>> hw/virtio/virtio.c                |  5 +++--
>> include/system/runstate.h         | 11 ++++++++---
>> system/cpus.c                     |  4 ++--
>> system/runstate.c                 | 25 ++++++++++++++++++++-----
>> 8 files changed, 44 insertions(+), 21 deletions(-)
>> 
>> diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
>> index 5135b4d8f1..4a48a16790 100644
>> --- a/hw/block/virtio-blk.c
>> +++ b/hw/block/virtio-blk.c
>> @@ -1928,7 +1928,7 @@ static void virtio_blk_device_realize(DeviceState *dev, Error **errp)
>>     * called after ->start_ioeventfd() has already set blk's AioContext.
>>     */
>>    s->change =
>> -        qdev_add_vm_change_state_handler(dev, virtio_blk_dma_restart_cb, s);
>> +        qdev_add_vm_change_state_handler(dev, virtio_blk_dma_restart_cb, NULL, s);
>> 
>>    blk_ram_registrar_init(&s->blk_ram_registrar, s->blk);
>>    blk_set_dev_ops(s->blk, &virtio_block_ops, s);
>> diff --git a/hw/core/vm-change-state-handler.c b/hw/core/vm-change-state-handler.c
>> index 7064995578..d5045b17c1 100644
>> --- a/hw/core/vm-change-state-handler.c
>> +++ b/hw/core/vm-change-state-handler.c
>> @@ -40,6 +40,7 @@ static int qdev_get_dev_tree_depth(DeviceState *dev)
>> * qdev_add_vm_change_state_handler:
>> * @dev: the device that owns this handler
>> * @cb: the callback function to be invoked
>> + * @cb_ext: the callback function with return value to be invoked
> 
> I think we need to document what happens if both `cb` and `cb_ext` are specified.

Indeed, it's best if `cb` and `cb_ext` are mutually exclusive, what about adding an
assert to ensure this.

> Also `cb_ext` is very cryptic, I'm not good with names, but what about something like `cb_ret`?

Acked.

> 
>> * @opaque: user data passed to the callback function
>> *
>> * This function works like qemu_add_vm_change_state_handler() except callbacks
>> @@ -54,21 +55,22 @@ static int qdev_get_dev_tree_depth(DeviceState *dev)
>> */
>> VMChangeStateEntry *qdev_add_vm_change_state_handler(DeviceState *dev,
>>                                                     VMChangeStateHandler *cb,
>> +                                                     VMChangeStateHandlerExt *cb_ext,
>>                                                     void *opaque)
>> {
>> -    return qdev_add_vm_change_state_handler_full(dev, cb, NULL, opaque);
>> +    return qdev_add_vm_change_state_handler_full(dev, cb, NULL, cb_ext, opaque);
>> }
>> 
>> /*
>> * Exactly like qdev_add_vm_change_state_handler() but passes a prepare_cb
>> - * argument too.
>> + * and the cb_ext arguments too.
>> */
>> VMChangeStateEntry *qdev_add_vm_change_state_handler_full(
>> -    DeviceState *dev, VMChangeStateHandler *cb,
>> -    VMChangeStateHandler *prepare_cb, void *opaque)
>> +    DeviceState *dev, VMChangeStateHandler *cb, VMChangeStateHandler *prepare_cb,
>> +    VMChangeStateHandlerExt *cb_ext, void *opaque)
> 
>> {
>>    int depth = qdev_get_dev_tree_depth(dev);
>> 
>> -    return qemu_add_vm_change_state_handler_prio_full(cb, prepare_cb, opaque,
>> -                                                      depth);
>> +    return qemu_add_vm_change_state_handler_prio_full(cb, prepare_cb, cb_ext,
>> +                                                      opaque, depth);
>> }
>> diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
>> index 7d4546800f..ec098f5f0a 100644
>> --- a/hw/scsi/scsi-bus.c
>> +++ b/hw/scsi/scsi-bus.c
>> @@ -356,7 +356,7 @@ static void scsi_qdev_realize(DeviceState *qdev, Error **errp)
>>        return;
>>    }
>>    dev->vmsentry = qdev_add_vm_change_state_handler(DEVICE(dev),
>> -            scsi_dma_restart_cb, dev);
>> +            scsi_dma_restart_cb, NULL, dev);
>> }
>> 
>> static void scsi_qdev_unrealize(DeviceState *qdev)
>> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
>> index 416643ddd6..f531db83ea 100644
>> --- a/hw/vfio/migration.c
>> +++ b/hw/vfio/migration.c
>> @@ -1015,7 +1015,7 @@ static int vfio_migration_init(VFIODevice *vbasedev)
>>                     vfio_vmstate_change_prepare :
>>                     NULL;
>>    migration->vm_state = qdev_add_vm_change_state_handler_full(
>> -        vbasedev->dev, vfio_vmstate_change, prepare_cb, vbasedev);
>> +        vbasedev->dev, vfio_vmstate_change, prepare_cb, NULL, vbasedev);
>>    migration_add_notifier(&migration->migration_state,
>>                           vfio_migration_state_notifier);
>> 
>> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
>> index 85110bce37..5e8d4cab53 100644
>> --- a/hw/virtio/virtio.c
>> +++ b/hw/virtio/virtio.c
>> @@ -3419,7 +3419,7 @@ void virtio_cleanup(VirtIODevice *vdev)
>>    qemu_del_vm_change_state_handler(vdev->vmstate);
>> }
>> 
>> -static void virtio_vmstate_change(void *opaque, bool running, RunState state)
>> +static int virtio_vmstate_change(void *opaque, bool running, RunState state)
>> {
>>    VirtIODevice *vdev = opaque;
>>    BusState *qbus = qdev_get_parent_bus(DEVICE(vdev));
>> @@ -3438,6 +3438,7 @@ static void virtio_vmstate_change(void *opaque, bool running, RunState state)
>>    if (!backend_run) {
>>        virtio_set_status(vdev, vdev->status);
>>    }
>> +    return 0;
>> }
>> 
>> void virtio_instance_init_common(Object *proxy_obj, void *data,
>> @@ -3489,7 +3490,7 @@ void virtio_init(VirtIODevice *vdev, uint16_t device_id, size_t config_size)
>>        vdev->config = NULL;
>>    }
>>    vdev->vmstate = qdev_add_vm_change_state_handler(DEVICE(vdev),
>> -            virtio_vmstate_change, vdev);
>> +            NULL, virtio_vmstate_change, vdev);
> 
> IIUC virtio_vmstate_change now returns always 0, so what is the point of
> this change?
> 
> I'd also do this change in a separate commit if it's really needed.

According to [1], the pre-patch just adds a return value for vm_state_notify,
and the next patches returns an error in the virtio and vhost-user code.

> 
>>    vdev->device_endian = virtio_default_endian();
>>    vdev->use_guest_notifier_mask = true;
>> }
>> diff --git a/include/system/runstate.h b/include/system/runstate.h
>> index bffc3719d4..af33ea92b6 100644
>> --- a/include/system/runstate.h
>> +++ b/include/system/runstate.h
>> @@ -12,6 +12,7 @@ bool runstate_needs_reset(void);
>> void runstate_replay_enable(void);
>> 
>> typedef void VMChangeStateHandler(void *opaque, bool running, RunState state);
>> +typedef int VMChangeStateHandlerExt(void *opaque, bool running, RunState state);
> 
> Ditto about "Ext"
> 
>> 
>> VMChangeStateEntry *qemu_add_vm_change_state_handler(VMChangeStateHandler *cb,
>>                                                     void *opaque);
>> @@ -20,21 +21,25 @@ VMChangeStateEntry *qemu_add_vm_change_state_handler_prio(
>> VMChangeStateEntry *
>> qemu_add_vm_change_state_handler_prio_full(VMChangeStateHandler *cb,
>>                                           VMChangeStateHandler *prepare_cb,
>> +                                           VMChangeStateHandlerExt *cb_ext,
>>                                           void *opaque, int priority);
>> VMChangeStateEntry *qdev_add_vm_change_state_handler(DeviceState *dev,
>>                                                     VMChangeStateHandler *cb,
>> +                                                     VMChangeStateHandlerExt *cb_ext,
>>                                                     void *opaque);
>> VMChangeStateEntry *qdev_add_vm_change_state_handler_full(
>> -    DeviceState *dev, VMChangeStateHandler *cb,
>> -    VMChangeStateHandler *prepare_cb, void *opaque);
>> +    DeviceState *dev, VMChangeStateHandler *cb, VMChangeStateHandler *prepare_cb,
>> +    VMChangeStateHandlerExt *cb_ext, void *opaque);
>> void qemu_del_vm_change_state_handler(VMChangeStateEntry *e);
>> /**
>> * vm_state_notify: Notify the state of the VM
>> *
>> * @running: whether the VM is running or not.
>> * @state: the #RunState of the VM.
>> + *
>> + * Return the result of the callback which has return value.
> 
> What if the callback has no return value?

vm_state_notify must have a return value. If all cb have no return value, it
will return 0 and the upper layer will do no additional processing as before.

> 
>> */
>> -void vm_state_notify(bool running, RunState state);
>> +int vm_state_notify(bool running, RunState state);
>> 
>> static inline bool shutdown_caused_by_guest(ShutdownCause cause)
>> {
>> diff --git a/system/cpus.c b/system/cpus.c
>> index 2cc5f887ab..6e1cf5720f 100644
>> --- a/system/cpus.c
>> +++ b/system/cpus.c
>> @@ -299,14 +299,14 @@ static int do_vm_stop(RunState state, bool send_stop)
>>        if (oldstate == RUN_STATE_RUNNING) {
>>            pause_all_vcpus();
>>        }
>> -        vm_state_notify(0, state);
>> +        ret = vm_state_notify(0, state);
>>        if (send_stop) {
>>            qapi_event_send_stop();
>>        }
>>    }
>> 
>>    bdrv_drain_all();
>> -    ret = bdrv_flush_all();
>> +    ret |= bdrv_flush_all();
> 
> Are we sure this is the right thing to do?
> If vm_state_notify() failed, shouldn't we go out first, and then why put them in or?
> 
> I think it should be explained in the commit or in a comment here.

Even if vm_state_notify() failed, it might be better to flush as before.

> 
>>    trace_vm_stop_flush_all(ret);
>> 
>>    return ret;
>> diff --git a/system/runstate.c b/system/runstate.c
>> index 272801d307..2219cec409 100644
>> --- a/system/runstate.c
>> +++ b/system/runstate.c
>> @@ -297,6 +297,7 @@ void qemu_system_vmstop_request(RunState state)
>> struct VMChangeStateEntry {
>>    VMChangeStateHandler *cb;
>>    VMChangeStateHandler *prepare_cb;
>> +    VMChangeStateHandlerExt *cb_ext;
>>    void *opaque;
>>    QTAILQ_ENTRY(VMChangeStateEntry) entries;
>>    int priority;
>> @@ -320,14 +321,15 @@ static QTAILQ_HEAD(, VMChangeStateEntry) vm_change_state_head =
>> VMChangeStateEntry *qemu_add_vm_change_state_handler_prio(
>>        VMChangeStateHandler *cb, void *opaque, int priority)
>> {
>> -    return qemu_add_vm_change_state_handler_prio_full(cb, NULL, opaque,
>> -                                                      priority);
>> +    return qemu_add_vm_change_state_handler_prio_full(cb, NULL, NULL,
>> +                                                      opaque, priority);
>> }
>> 
>> /**
>> * qemu_add_vm_change_state_handler_prio_full:
>> * @cb: the main callback to invoke
>> * @prepare_cb: a callback to invoke before the main callback
>> + * @cb_ext: the main callback to invoke with return value
>> * @opaque: user data passed to the callbacks
>> * @priority: low priorities execute first when the vm runs and the reverse is
>> *            true when the vm stops
>> @@ -344,6 +346,7 @@ VMChangeStateEntry *qemu_add_vm_change_state_handler_prio(
>> VMChangeStateEntry *
>> qemu_add_vm_change_state_handler_prio_full(VMChangeStateHandler *cb,
>>                                           VMChangeStateHandler *prepare_cb,
>> +                                           VMChangeStateHandlerExt *cb_ext,
>>                                           void *opaque, int priority)
>> {
>>    VMChangeStateEntry *e;
>> @@ -352,6 +355,7 @@ qemu_add_vm_change_state_handler_prio_full(VMChangeStateHandler *cb,
>>    e = g_malloc0(sizeof(*e));
>>    e->cb = cb;
>>    e->prepare_cb = prepare_cb;
>> +    e->cb_ext = cb_ext;
>>    e->opaque = opaque;
>>    e->priority = priority;
>> 
>> @@ -379,9 +383,10 @@ void qemu_del_vm_change_state_handler(VMChangeStateEntry *e)
>>    g_free(e);
>> }
>> 
>> -void vm_state_notify(bool running, RunState state)
>> +int vm_state_notify(bool running, RunState state)
>> {
>>    VMChangeStateEntry *e, *next;
>> +    int ret = 0;
>> 
>>    trace_vm_state_notify(running, state, RunState_str(state));
>> 
>> @@ -393,7 +398,12 @@ void vm_state_notify(bool running, RunState state)
>>        }
>> 
>>        QTAILQ_FOREACH_SAFE(e, &vm_change_state_head, entries, next) {
>> -            e->cb(e->opaque, running, state);
>> +            if (e->cb) {
>> +                e->cb(e->opaque, running, state);
>> +            } else if (e->cb_ext) {
>> +                // no need to process the result when starting VM
> 
> Why?
> 
> (a good comment should explain more why than what we're doing which is pretty clear)

Acked.
Since we only want to know the return value of callback when the stopping device
live migration.

> 
>> +                e->cb_ext(e->opaque, running, state);
>> +            }
>>        }
>>    } else {
>>        QTAILQ_FOREACH_REVERSE_SAFE(e, &vm_change_state_head, entries, next) {
>> @@ -403,9 +413,14 @@ void vm_state_notify(bool running, RunState state)
>>        }
>> 
>>        QTAILQ_FOREACH_REVERSE_SAFE(e, &vm_change_state_head, entries, next) {
>> -            e->cb(e->opaque, running, state);
>> +            if (e->cb) {
>> +                e->cb(e->opaque, running, state);
>> +            } else if (e->cb_ext) {
>> +                ret |= e->cb_ext(e->opaque, running, state);
> 
> I think putting them in or should be documented or at least explained somewhere. It's not clear to me, but it's true that I don't know this code.

Let's add some comment here.

> 
>> +            }
>>        }
>>    }
>> +    return ret;
>> }
>> 
>> static ShutdownCause reset_requested;
>> -- 
>> 2.48.1

Re: [PATCH v2 1/3] virtio: add VM state change cb with return value

Posted by Stefano Garzarella 1 week, 1 day ago

On Thu, Mar 20, 2025 at 08:21:18PM +0800, Haoqian He wrote:
>
>> 2025年3月19日 22:50，Stefano Garzarella <sgarzare@redhat.com> 写道：
>>
>> On Fri, Mar 14, 2025 at 06:15:32AM -0400, Haoqian He wrote:
>>> This patch contains two changes:
>>>
>>> 1. Add VM state change cb type VMChangeStateHandlerExt which has return
>>> value for virtio devices VMChangeStateEntry. When VM state changes,
>>> virtio device will call the _Ext version.
>>>
>>> 2. Add return value for vm_state_notify().
>>
>> Can you explain why these changes are needed?
>
>[1] The first and second patches are both pre-patch of the third patch, and the reason
>for these changes is in the email cover and the third patch commit message.
>I will briefly explain it in the commit message of the pre-patch.

Yes, please, it's not so much for me, but for the future, a good commit 
message should explain the reason for the changes.

>
>>
>>>
>>> Signed-off-by: Haoqian He <haoqian.he@smartx.com>
>>> ---
>>> hw/block/virtio-blk.c             |  2 +-
>>> hw/core/vm-change-state-handler.c | 14 ++++++++------
>>> hw/scsi/scsi-bus.c                |  2 +-
>>> hw/vfio/migration.c               |  2 +-
>>> hw/virtio/virtio.c                |  5 +++--
>>> include/system/runstate.h         | 11 ++++++++---
>>> system/cpus.c                     |  4 ++--
>>> system/runstate.c                 | 25 ++++++++++++++++++++-----
>>> 8 files changed, 44 insertions(+), 21 deletions(-)
>>>
>>> diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
>>> index 5135b4d8f1..4a48a16790 100644
>>> --- a/hw/block/virtio-blk.c
>>> +++ b/hw/block/virtio-blk.c
>>> @@ -1928,7 +1928,7 @@ static void virtio_blk_device_realize(DeviceState *dev, Error **errp)
>>>     * called after ->start_ioeventfd() has already set blk's AioContext.
>>>     */
>>>    s->change =
>>> -        qdev_add_vm_change_state_handler(dev, virtio_blk_dma_restart_cb, s);
>>> +        qdev_add_vm_change_state_handler(dev, virtio_blk_dma_restart_cb, NULL, s);
>>>
>>>    blk_ram_registrar_init(&s->blk_ram_registrar, s->blk);
>>>    blk_set_dev_ops(s->blk, &virtio_block_ops, s);
>>> diff --git a/hw/core/vm-change-state-handler.c b/hw/core/vm-change-state-handler.c
>>> index 7064995578..d5045b17c1 100644
>>> --- a/hw/core/vm-change-state-handler.c
>>> +++ b/hw/core/vm-change-state-handler.c
>>> @@ -40,6 +40,7 @@ static int qdev_get_dev_tree_depth(DeviceState *dev)
>>> * qdev_add_vm_change_state_handler:
>>> * @dev: the device that owns this handler
>>> * @cb: the callback function to be invoked
>>> + * @cb_ext: the callback function with return value to be invoked
>>
>> I think we need to document what happens if both `cb` and `cb_ext` are specified.
>
>Indeed, it's best if `cb` and `cb_ext` are mutually exclusive, what about adding an
>assert to ensure this.

Yep, documenation + assert should be fine.

>
>> Also `cb_ext` is very cryptic, I'm not good with names, but what about something like `cb_ret`?
>
>Acked.
>
>>
>>> * @opaque: user data passed to the callback function
>>> *
>>> * This function works like qemu_add_vm_change_state_handler() except callbacks
>>> @@ -54,21 +55,22 @@ static int qdev_get_dev_tree_depth(DeviceState *dev)
>>> */
>>> VMChangeStateEntry *qdev_add_vm_change_state_handler(DeviceState *dev,
>>>                                                     VMChangeStateHandler *cb,
>>> +                                                     VMChangeStateHandlerExt *cb_ext,
>>>                                                     void *opaque)
>>> {
>>> -    return qdev_add_vm_change_state_handler_full(dev, cb, NULL, opaque);
>>> +    return qdev_add_vm_change_state_handler_full(dev, cb, NULL, cb_ext, opaque);
>>> }
>>>
>>> /*
>>> * Exactly like qdev_add_vm_change_state_handler() but passes a prepare_cb
>>> - * argument too.
>>> + * and the cb_ext arguments too.
>>> */
>>> VMChangeStateEntry *qdev_add_vm_change_state_handler_full(
>>> -    DeviceState *dev, VMChangeStateHandler *cb,
>>> -    VMChangeStateHandler *prepare_cb, void *opaque)
>>> +    DeviceState *dev, VMChangeStateHandler *cb, VMChangeStateHandler *prepare_cb,
>>> +    VMChangeStateHandlerExt *cb_ext, void *opaque)
>>
>>> {
>>>    int depth = qdev_get_dev_tree_depth(dev);
>>>
>>> -    return qemu_add_vm_change_state_handler_prio_full(cb, prepare_cb, opaque,
>>> -                                                      depth);
>>> +    return qemu_add_vm_change_state_handler_prio_full(cb, prepare_cb, cb_ext,
>>> +                                                      opaque, depth);
>>> }
>>> diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
>>> index 7d4546800f..ec098f5f0a 100644
>>> --- a/hw/scsi/scsi-bus.c
>>> +++ b/hw/scsi/scsi-bus.c
>>> @@ -356,7 +356,7 @@ static void scsi_qdev_realize(DeviceState *qdev, Error **errp)
>>>        return;
>>>    }
>>>    dev->vmsentry = qdev_add_vm_change_state_handler(DEVICE(dev),
>>> -            scsi_dma_restart_cb, dev);
>>> +            scsi_dma_restart_cb, NULL, dev);
>>> }
>>>
>>> static void scsi_qdev_unrealize(DeviceState *qdev)
>>> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
>>> index 416643ddd6..f531db83ea 100644
>>> --- a/hw/vfio/migration.c
>>> +++ b/hw/vfio/migration.c
>>> @@ -1015,7 +1015,7 @@ static int vfio_migration_init(VFIODevice *vbasedev)
>>>                     vfio_vmstate_change_prepare :
>>>                     NULL;
>>>    migration->vm_state = qdev_add_vm_change_state_handler_full(
>>> -        vbasedev->dev, vfio_vmstate_change, prepare_cb, vbasedev);
>>> +        vbasedev->dev, vfio_vmstate_change, prepare_cb, NULL, vbasedev);
>>>    migration_add_notifier(&migration->migration_state,
>>>                           vfio_migration_state_notifier);
>>>
>>> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
>>> index 85110bce37..5e8d4cab53 100644
>>> --- a/hw/virtio/virtio.c
>>> +++ b/hw/virtio/virtio.c
>>> @@ -3419,7 +3419,7 @@ void virtio_cleanup(VirtIODevice *vdev)
>>>    qemu_del_vm_change_state_handler(vdev->vmstate);
>>> }
>>>
>>> -static void virtio_vmstate_change(void *opaque, bool running, RunState state)
>>> +static int virtio_vmstate_change(void *opaque, bool running, RunState state)
>>> {
>>>    VirtIODevice *vdev = opaque;
>>>    BusState *qbus = qdev_get_parent_bus(DEVICE(vdev));
>>> @@ -3438,6 +3438,7 @@ static void virtio_vmstate_change(void *opaque, bool running, RunState state)
>>>    if (!backend_run) {
>>>        virtio_set_status(vdev, vdev->status);
>>>    }
>>> +    return 0;
>>> }
>>>
>>> void virtio_instance_init_common(Object *proxy_obj, void *data,
>>> @@ -3489,7 +3490,7 @@ void virtio_init(VirtIODevice *vdev, uint16_t device_id, size_t config_size)
>>>        vdev->config = NULL;
>>>    }
>>>    vdev->vmstate = qdev_add_vm_change_state_handler(DEVICE(vdev),
>>> -            virtio_vmstate_change, vdev);
>>> +            NULL, virtio_vmstate_change, vdev);
>>
>> IIUC virtio_vmstate_change now returns always 0, so what is the point of
>> this change?
>>
>> I'd also do this change in a separate commit if it's really needed.
>
>According to [1], the pre-patch just adds a return value for vm_state_notify,
>and the next patches returns an error in the virtio and vhost-user code.

Why not changing this in the next patch where it can return an error?

>
>>
>>>    vdev->device_endian = virtio_default_endian();
>>>    vdev->use_guest_notifier_mask = true;
>>> }
>>> diff --git a/include/system/runstate.h b/include/system/runstate.h
>>> index bffc3719d4..af33ea92b6 100644
>>> --- a/include/system/runstate.h
>>> +++ b/include/system/runstate.h
>>> @@ -12,6 +12,7 @@ bool runstate_needs_reset(void);
>>> void runstate_replay_enable(void);
>>>
>>> typedef void VMChangeStateHandler(void *opaque, bool running, RunState state);
>>> +typedef int VMChangeStateHandlerExt(void *opaque, bool running, RunState state);
>>
>> Ditto about "Ext"
>>
>>>
>>> VMChangeStateEntry *qemu_add_vm_change_state_handler(VMChangeStateHandler *cb,
>>>                                                     void *opaque);
>>> @@ -20,21 +21,25 @@ VMChangeStateEntry *qemu_add_vm_change_state_handler_prio(
>>> VMChangeStateEntry *
>>> qemu_add_vm_change_state_handler_prio_full(VMChangeStateHandler *cb,
>>>                                           VMChangeStateHandler *prepare_cb,
>>> +                                           VMChangeStateHandlerExt *cb_ext,
>>>                                           void *opaque, int priority);
>>> VMChangeStateEntry *qdev_add_vm_change_state_handler(DeviceState *dev,
>>>                                                     VMChangeStateHandler *cb,
>>> +                                                     VMChangeStateHandlerExt *cb_ext,
>>>                                                     void *opaque);
>>> VMChangeStateEntry *qdev_add_vm_change_state_handler_full(
>>> -    DeviceState *dev, VMChangeStateHandler *cb,
>>> -    VMChangeStateHandler *prepare_cb, void *opaque);
>>> +    DeviceState *dev, VMChangeStateHandler *cb, VMChangeStateHandler *prepare_cb,
>>> +    VMChangeStateHandlerExt *cb_ext, void *opaque);
>>> void qemu_del_vm_change_state_handler(VMChangeStateEntry *e);
>>> /**
>>> * vm_state_notify: Notify the state of the VM
>>> *
>>> * @running: whether the VM is running or not.
>>> * @state: the #RunState of the VM.
>>> + *
>>> + * Return the result of the callback which has return value.
>>
>> What if the callback has no return value?
>
>vm_state_notify must have a return value. If all cb have no return value, it
>will return 0 and the upper layer will do no additional processing as before.

Please, add that to the documentation.

>
>>
>>> */
>>> -void vm_state_notify(bool running, RunState state);
>>> +int vm_state_notify(bool running, RunState state);
>>>
>>> static inline bool shutdown_caused_by_guest(ShutdownCause cause)
>>> {
>>> diff --git a/system/cpus.c b/system/cpus.c
>>> index 2cc5f887ab..6e1cf5720f 100644
>>> --- a/system/cpus.c
>>> +++ b/system/cpus.c
>>> @@ -299,14 +299,14 @@ static int do_vm_stop(RunState state, bool send_stop)
>>>        if (oldstate == RUN_STATE_RUNNING) {
>>>            pause_all_vcpus();
>>>        }
>>> -        vm_state_notify(0, state);
>>> +        ret = vm_state_notify(0, state);
>>>        if (send_stop) {
>>>            qapi_event_send_stop();
>>>        }
>>>    }
>>>
>>>    bdrv_drain_all();
>>> -    ret = bdrv_flush_all();
>>> +    ret |= bdrv_flush_all();
>>
>> Are we sure this is the right thing to do?
>> If vm_state_notify() failed, shouldn't we go out first, and then why put them in or?
>>
>> I think it should be explained in the commit or in a comment here.
>
>Even if vm_state_notify() failed, it might be better to flush as 
>before.

Got it, please add a comment.

>
>>
>>>    trace_vm_stop_flush_all(ret);
>>>
>>>    return ret;
>>> diff --git a/system/runstate.c b/system/runstate.c
>>> index 272801d307..2219cec409 100644
>>> --- a/system/runstate.c
>>> +++ b/system/runstate.c
>>> @@ -297,6 +297,7 @@ void qemu_system_vmstop_request(RunState state)
>>> struct VMChangeStateEntry {
>>>    VMChangeStateHandler *cb;
>>>    VMChangeStateHandler *prepare_cb;
>>> +    VMChangeStateHandlerExt *cb_ext;
>>>    void *opaque;
>>>    QTAILQ_ENTRY(VMChangeStateEntry) entries;
>>>    int priority;
>>> @@ -320,14 +321,15 @@ static QTAILQ_HEAD(, VMChangeStateEntry) vm_change_state_head =
>>> VMChangeStateEntry *qemu_add_vm_change_state_handler_prio(
>>>        VMChangeStateHandler *cb, void *opaque, int priority)
>>> {
>>> -    return qemu_add_vm_change_state_handler_prio_full(cb, NULL, opaque,
>>> -                                                      priority);
>>> +    return qemu_add_vm_change_state_handler_prio_full(cb, NULL, NULL,
>>> +                                                      opaque, priority);
>>> }
>>>
>>> /**
>>> * qemu_add_vm_change_state_handler_prio_full:
>>> * @cb: the main callback to invoke
>>> * @prepare_cb: a callback to invoke before the main callback
>>> + * @cb_ext: the main callback to invoke with return value
>>> * @opaque: user data passed to the callbacks
>>> * @priority: low priorities execute first when the vm runs and the reverse is
>>> *            true when the vm stops
>>> @@ -344,6 +346,7 @@ VMChangeStateEntry *qemu_add_vm_change_state_handler_prio(
>>> VMChangeStateEntry *
>>> qemu_add_vm_change_state_handler_prio_full(VMChangeStateHandler *cb,
>>>                                           VMChangeStateHandler *prepare_cb,
>>> +                                           VMChangeStateHandlerExt *cb_ext,
>>>                                           void *opaque, int priority)
>>> {
>>>    VMChangeStateEntry *e;
>>> @@ -352,6 +355,7 @@ qemu_add_vm_change_state_handler_prio_full(VMChangeStateHandler *cb,
>>>    e = g_malloc0(sizeof(*e));
>>>    e->cb = cb;
>>>    e->prepare_cb = prepare_cb;
>>> +    e->cb_ext = cb_ext;
>>>    e->opaque = opaque;
>>>    e->priority = priority;
>>>
>>> @@ -379,9 +383,10 @@ void qemu_del_vm_change_state_handler(VMChangeStateEntry *e)
>>>    g_free(e);
>>> }
>>>
>>> -void vm_state_notify(bool running, RunState state)
>>> +int vm_state_notify(bool running, RunState state)
>>> {
>>>    VMChangeStateEntry *e, *next;
>>> +    int ret = 0;
>>>
>>>    trace_vm_state_notify(running, state, RunState_str(state));
>>>
>>> @@ -393,7 +398,12 @@ void vm_state_notify(bool running, RunState state)
>>>        }
>>>
>>>        QTAILQ_FOREACH_SAFE(e, &vm_change_state_head, entries, next) {
>>> -            e->cb(e->opaque, running, state);
>>> +            if (e->cb) {
>>> +                e->cb(e->opaque, running, state);
>>> +            } else if (e->cb_ext) {
>>> +                // no need to process the result when starting VM
>>
>> Why?
>>
>> (a good comment should explain more why than what we're doing which is pretty clear)
>
>Acked.
>Since we only want to know the return value of callback when the stopping device
>live migration.
>
>>
>>> +                e->cb_ext(e->opaque, running, state);
>>> +            }
>>>        }
>>>    } else {
>>>        QTAILQ_FOREACH_REVERSE_SAFE(e, &vm_change_state_head, entries, next) {
>>> @@ -403,9 +413,14 @@ void vm_state_notify(bool running, RunState state)
>>>        }
>>>
>>>        QTAILQ_FOREACH_REVERSE_SAFE(e, &vm_change_state_head, entries, next) {
>>> -            e->cb(e->opaque, running, state);
>>> +            if (e->cb) {
>>> +                e->cb(e->opaque, running, state);
>>> +            } else if (e->cb_ext) {
>>> +                ret |= e->cb_ext(e->opaque, running, state);
>>
>> I think putting them in or should be documented or at least explained somewhere. It's not clear to me, but it's true that I don't know this code.
>
>Let's add some comment here.

Yep!

Thanks,
Stefano

>
>>
>>> +            }
>>>        }
>>>    }
>>> +    return ret;
>>> }
>>>
>>> static ShutdownCause reset_requested;
>>> --
>>> 2.48.1
>
>

[PATCH v2 2/3] vhost: return failure if stop virtqueue failed in vhost_dev_stop

Posted by Haoqian He 2 weeks, 5 days ago

The backend maybe crash when vhost_dev_stop and GET_VRING_BASE
would fail, we can return failure to indicate the connection
with the backend is broken.

Signed-off-by: Haoqian He <haoqian.he@smartx.com>
---
 hw/virtio/vhost.c         | 27 +++++++++++++++------------
 include/hw/virtio/vhost.h |  8 +++++---
 2 files changed, 20 insertions(+), 15 deletions(-)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 6aa72fd434..c82bbbe4cc 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1368,23 +1368,23 @@ fail_alloc_desc:
     return r;
 }
 
-void vhost_virtqueue_stop(struct vhost_dev *dev,
-                          struct VirtIODevice *vdev,
-                          struct vhost_virtqueue *vq,
-                          unsigned idx)
+int vhost_virtqueue_stop(struct vhost_dev *dev,
+                         struct VirtIODevice *vdev,
+                         struct vhost_virtqueue *vq,
+                         unsigned idx)
 {
     int vhost_vq_index = dev->vhost_ops->vhost_get_vq_index(dev, idx);
     struct vhost_vring_state state = {
         .index = vhost_vq_index,
     };
-    int r;
+    int r = 0;
 
     if (virtio_queue_get_desc_addr(vdev, idx) == 0) {
         /* Don't stop the virtqueue which might have not been started */
-        return;
+        return 0;
     }
 
-    r = dev->vhost_ops->vhost_get_vring_base(dev, &state);
+    r |= dev->vhost_ops->vhost_get_vring_base(dev, &state);
     if (r < 0) {
         VHOST_OPS_DEBUG(r, "vhost VQ %u ring restore failed: %d", idx, r);
         /* Connection to the backend is broken, so let's sync internal
@@ -1412,6 +1412,7 @@ void vhost_virtqueue_stop(struct vhost_dev *dev,
                        0, virtio_queue_get_avail_size(vdev, idx));
     vhost_memory_unmap(dev, vq->desc, virtio_queue_get_desc_size(vdev, idx),
                        0, virtio_queue_get_desc_size(vdev, idx));
+    return r;
 }
 
 static int vhost_virtqueue_set_busyloop_timeout(struct vhost_dev *dev,
@@ -2136,9 +2137,10 @@ fail_features:
 }
 
 /* Host notifiers must be enabled at this point. */
-void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
+int vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
 {
     int i;
+    int rc = 0;
 
     /* should only be called after backend is connected */
     assert(hdev->vhost_ops);
@@ -2157,10 +2159,10 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
         vhost_dev_set_vring_enable(hdev, false);
     }
     for (i = 0; i < hdev->nvqs; ++i) {
-        vhost_virtqueue_stop(hdev,
-                             vdev,
-                             hdev->vqs + i,
-                             hdev->vq_index + i);
+        rc |= vhost_virtqueue_stop(hdev,
+                                   vdev,
+                                   hdev->vqs + i,
+                                   hdev->vq_index + i);
     }
     if (hdev->vhost_ops->vhost_reset_status) {
         hdev->vhost_ops->vhost_reset_status(hdev);
@@ -2177,6 +2179,7 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
     hdev->started = false;
     vdev->vhost_started = false;
     hdev->vdev = NULL;
+    return rc;
 }
 
 int vhost_net_set_backend(struct vhost_dev *hdev,
diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index a9469d50bc..fd96ec9c39 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -232,8 +232,10 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings);
  * Stop the vhost device. After the device is stopped the notifiers
  * can be disabled (@vhost_dev_disable_notifiers) and the device can
  * be torn down (@vhost_dev_cleanup).
+ *
+ * Return: 0 on success, != 0 on error when stopping dev.
  */
-void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings);
+int vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings);
 
 /**
  * DOC: vhost device configuration handling
@@ -333,8 +335,8 @@ int vhost_device_iotlb_miss(struct vhost_dev *dev, uint64_t iova, int write);
 
 int vhost_virtqueue_start(struct vhost_dev *dev, struct VirtIODevice *vdev,
                           struct vhost_virtqueue *vq, unsigned idx);
-void vhost_virtqueue_stop(struct vhost_dev *dev, struct VirtIODevice *vdev,
-                          struct vhost_virtqueue *vq, unsigned idx);
+int vhost_virtqueue_stop(struct vhost_dev *dev, struct VirtIODevice *vdev,
+                         struct vhost_virtqueue *vq, unsigned idx);
 
 void vhost_dev_reset_inflight(struct vhost_inflight *inflight);
 void vhost_dev_free_inflight(struct vhost_inflight *inflight);
-- 
2.48.1

Re: [PATCH v2 2/3] vhost: return failure if stop virtqueue failed in vhost_dev_stop

Posted by Stefano Garzarella 1 week, 6 days ago

On Fri, Mar 14, 2025 at 06:15:33AM -0400, Haoqian He wrote:
>The backend maybe crash when vhost_dev_stop and GET_VRING_BASE
>would fail, we can return failure to indicate the connection
>with the backend is broken.
>
>Signed-off-by: Haoqian He <haoqian.he@smartx.com>
>---
> hw/virtio/vhost.c         | 27 +++++++++++++++------------
> include/hw/virtio/vhost.h |  8 +++++---
> 2 files changed, 20 insertions(+), 15 deletions(-)
>
>diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
>index 6aa72fd434..c82bbbe4cc 100644
>--- a/hw/virtio/vhost.c
>+++ b/hw/virtio/vhost.c
>@@ -1368,23 +1368,23 @@ fail_alloc_desc:
>     return r;
> }
>
>-void vhost_virtqueue_stop(struct vhost_dev *dev,
>-                          struct VirtIODevice *vdev,
>-                          struct vhost_virtqueue *vq,
>-                          unsigned idx)
>+int vhost_virtqueue_stop(struct vhost_dev *dev,
>+                         struct VirtIODevice *vdev,
>+                         struct vhost_virtqueue *vq,
>+                         unsigned idx)
> {
>     int vhost_vq_index = dev->vhost_ops->vhost_get_vq_index(dev, idx);
>     struct vhost_vring_state state = {
>         .index = vhost_vq_index,
>     };
>-    int r;
>+    int r = 0;
>
>     if (virtio_queue_get_desc_addr(vdev, idx) == 0) {
>         /* Don't stop the virtqueue which might have not been started */
>-        return;
>+        return 0;
>     }
>
>-    r = dev->vhost_ops->vhost_get_vring_base(dev, &state);
>+    r |= dev->vhost_ops->vhost_get_vring_base(dev, &state);

We can avoid this and also initialize r to 0.

>     if (r < 0) {
>         VHOST_OPS_DEBUG(r, "vhost VQ %u ring restore failed: %d", idx, r);
>         /* Connection to the backend is broken, so let's sync internal
>@@ -1412,6 +1412,7 @@ void vhost_virtqueue_stop(struct vhost_dev *dev,
>                        0, virtio_queue_get_avail_size(vdev, idx));
>     vhost_memory_unmap(dev, vq->desc, virtio_queue_get_desc_size(vdev, idx),
>                        0, virtio_queue_get_desc_size(vdev, idx));
>+    return r;
> }
>
> static int vhost_virtqueue_set_busyloop_timeout(struct vhost_dev *dev,
>@@ -2136,9 +2137,10 @@ fail_features:
> }
>
> /* Host notifiers must be enabled at this point. */
>-void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
>+int vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
> {
>     int i;
>+    int rc = 0;
>
>     /* should only be called after backend is connected */
>     assert(hdev->vhost_ops);
>@@ -2157,10 +2159,10 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
>         vhost_dev_set_vring_enable(hdev, false);
>     }
>     for (i = 0; i < hdev->nvqs; ++i) {
>-        vhost_virtqueue_stop(hdev,
>-                             vdev,
>-                             hdev->vqs + i,
>-                             hdev->vq_index + i);
>+        rc |= vhost_virtqueue_stop(hdev,
>+                                   vdev,
>+                                   hdev->vqs + i,
>+                                   hdev->vq_index + i);

Also other function can fails, should we consider also them?
(e.g. vhost_dev_start, vhost_dev_set_vring_enable, etc.)

If not, why?

>     }
>     if (hdev->vhost_ops->vhost_reset_status) {
>         hdev->vhost_ops->vhost_reset_status(hdev);
>@@ -2177,6 +2179,7 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
>     hdev->started = false;
>     vdev->vhost_started = false;
>     hdev->vdev = NULL;
>+    return rc;
> }
>
> int vhost_net_set_backend(struct vhost_dev *hdev,
>diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
>index a9469d50bc..fd96ec9c39 100644
>--- a/include/hw/virtio/vhost.h
>+++ b/include/hw/virtio/vhost.h
>@@ -232,8 +232,10 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings);
>  * Stop the vhost device. After the device is stopped the notifiers
>  * can be disabled (@vhost_dev_disable_notifiers) and the device can
>  * be torn down (@vhost_dev_cleanup).
>+ *
>+ * Return: 0 on success, != 0 on error when stopping dev.
>  */
>-void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings);
>+int vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings);
>
> /**
>  * DOC: vhost device configuration handling
>@@ -333,8 +335,8 @@ int vhost_device_iotlb_miss(struct vhost_dev *dev, uint64_t iova, int write);
>
> int vhost_virtqueue_start(struct vhost_dev *dev, struct VirtIODevice *vdev,
>                           struct vhost_virtqueue *vq, unsigned idx);
>-void vhost_virtqueue_stop(struct vhost_dev *dev, struct VirtIODevice *vdev,
>-                          struct vhost_virtqueue *vq, unsigned idx);
>+int vhost_virtqueue_stop(struct vhost_dev *dev, struct VirtIODevice *vdev,
>+                         struct vhost_virtqueue *vq, unsigned idx);
>
> void vhost_dev_reset_inflight(struct vhost_inflight *inflight);
> void vhost_dev_free_inflight(struct vhost_inflight *inflight);
>-- 
>2.48.1
>

Re: [PATCH v2 2/3] vhost: return failure if stop virtqueue failed in vhost_dev_stop

Posted by Haoqian He 1 week, 6 days ago

> 2025年3月19日 23:11，Stefano Garzarella <sgarzare@redhat.com> 写道：
> 
> On Fri, Mar 14, 2025 at 06:15:33AM -0400, Haoqian He wrote:
>> The backend maybe crash when vhost_dev_stop and GET_VRING_BASE
>> would fail, we can return failure to indicate the connection
>> with the backend is broken.
>> 
>> Signed-off-by: Haoqian He <haoqian.he@smartx.com>
>> ---
>> hw/virtio/vhost.c         | 27 +++++++++++++++------------
>> include/hw/virtio/vhost.h |  8 +++++---
>> 2 files changed, 20 insertions(+), 15 deletions(-)
>> 
>> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
>> index 6aa72fd434..c82bbbe4cc 100644
>> --- a/hw/virtio/vhost.c
>> +++ b/hw/virtio/vhost.c
>> @@ -1368,23 +1368,23 @@ fail_alloc_desc:
>>    return r;
>> }
>> 
>> -void vhost_virtqueue_stop(struct vhost_dev *dev,
>> -                          struct VirtIODevice *vdev,
>> -                          struct vhost_virtqueue *vq,
>> -                          unsigned idx)
>> +int vhost_virtqueue_stop(struct vhost_dev *dev,
>> +                         struct VirtIODevice *vdev,
>> +                         struct vhost_virtqueue *vq,
>> +                         unsigned idx)
>> {
>>    int vhost_vq_index = dev->vhost_ops->vhost_get_vq_index(dev, idx);
>>    struct vhost_vring_state state = {
>>        .index = vhost_vq_index,
>>    };
>> -    int r;
>> +    int r = 0;
>> 
>>    if (virtio_queue_get_desc_addr(vdev, idx) == 0) {
>>        /* Don't stop the virtqueue which might have not been started */
>> -        return;
>> +        return 0;
>>    }
>> 
>> -    r = dev->vhost_ops->vhost_get_vring_base(dev, &state);
>> +    r |= dev->vhost_ops->vhost_get_vring_base(dev, &state);
> 
> We can avoid this and also initialize r to 0.

Here we need to do `vhost_virtqueue_stop` for each vq.

> 
>>    if (r < 0) {
>>        VHOST_OPS_DEBUG(r, "vhost VQ %u ring restore failed: %d", idx, r);
>>        /* Connection to the backend is broken, so let's sync internal
>> @@ -1412,6 +1412,7 @@ void vhost_virtqueue_stop(struct vhost_dev *dev,
>>                       0, virtio_queue_get_avail_size(vdev, idx));
>>    vhost_memory_unmap(dev, vq->desc, virtio_queue_get_desc_size(vdev, idx),
>>                       0, virtio_queue_get_desc_size(vdev, idx));
>> +    return r;
>> }
>> 
>> static int vhost_virtqueue_set_busyloop_timeout(struct vhost_dev *dev,
>> @@ -2136,9 +2137,10 @@ fail_features:
>> }
>> 
>> /* Host notifiers must be enabled at this point. */
>> -void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
>> +int vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
>> {
>>    int i;
>> +    int rc = 0;
>> 
>>    /* should only be called after backend is connected */
>>    assert(hdev->vhost_ops);
>> @@ -2157,10 +2159,10 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
>>        vhost_dev_set_vring_enable(hdev, false);
>>    }
>>    for (i = 0; i < hdev->nvqs; ++i) {
>> -        vhost_virtqueue_stop(hdev,
>> -                             vdev,
>> -                             hdev->vqs + i,
>> -                             hdev->vq_index + i);
>> +        rc |= vhost_virtqueue_stop(hdev,
>> +                                   vdev,
>> +                                   hdev->vqs + i,
>> +                                   hdev->vq_index + i);
> 
> Also other function can fails, should we consider also them?
> (e.g. 	, vhost_dev_set_vring_enable, etc.)
> 
> If not, why?

Since we only want to know the return value of callback when the stopping device
live migration, there is no need to catch the failure of `vhost_dev_start`.

We can also catch the failure of `vhost_dev_set_vring_enable`, because
`vhost_dev_set_vring_enable` will also fail if qemu lost connection with the
the backend, but I need to test it.


> 
>>    }
>>    if (hdev->vhost_ops->vhost_reset_status) {
>>        hdev->vhost_ops->vhost_reset_status(hdev);
>> @@ -2177,6 +2179,7 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
>>    hdev->started = false;
>>    vdev->vhost_started = false;
>>    hdev->vdev = NULL;
>> +    return rc;
>> }
>> 
>> int vhost_net_set_backend(struct vhost_dev *hdev,
>> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
>> index a9469d50bc..fd96ec9c39 100644
>> --- a/include/hw/virtio/vhost.h
>> +++ b/include/hw/virtio/vhost.h
>> @@ -232,8 +232,10 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings);
>> * Stop the vhost device. After the device is stopped the notifiers
>> * can be disabled (@vhost_dev_disable_notifiers) and the device can
>> * be torn down (@vhost_dev_cleanup).
>> + *
>> + * Return: 0 on success, != 0 on error when stopping dev.
>> */
>> -void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings);
>> +int vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings);
>> 
>> /**
>> * DOC: vhost device configuration handling
>> @@ -333,8 +335,8 @@ int vhost_device_iotlb_miss(struct vhost_dev *dev, uint64_t iova, int write);
>> 
>> int vhost_virtqueue_start(struct vhost_dev *dev, struct VirtIODevice *vdev,
>>                          struct vhost_virtqueue *vq, unsigned idx);
>> -void vhost_virtqueue_stop(struct vhost_dev *dev, struct VirtIODevice *vdev,
>> -                          struct vhost_virtqueue *vq, unsigned idx);
>> +int vhost_virtqueue_stop(struct vhost_dev *dev, struct VirtIODevice *vdev,
>> +                         struct vhost_virtqueue *vq, unsigned idx);
>> 
>> void vhost_dev_reset_inflight(struct vhost_inflight *inflight);
>> void vhost_dev_free_inflight(struct vhost_inflight *inflight);
>> -- 
>> 2.48.1

Re: [PATCH v2 2/3] vhost: return failure if stop virtqueue failed in vhost_dev_stop

Posted by Stefano Garzarella 1 week, 1 day ago

On Thu, Mar 20, 2025 at 08:21:25PM +0800, Haoqian He wrote:
>
>> 2025年3月19日 23:11，Stefano Garzarella <sgarzare@redhat.com> 写道：
>>
>> On Fri, Mar 14, 2025 at 06:15:33AM -0400, Haoqian He wrote:
>>> The backend maybe crash when vhost_dev_stop and GET_VRING_BASE
>>> would fail, we can return failure to indicate the connection
>>> with the backend is broken.
>>>
>>> Signed-off-by: Haoqian He <haoqian.he@smartx.com>
>>> ---
>>> hw/virtio/vhost.c         | 27 +++++++++++++++------------
>>> include/hw/virtio/vhost.h |  8 +++++---
>>> 2 files changed, 20 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
>>> index 6aa72fd434..c82bbbe4cc 100644
>>> --- a/hw/virtio/vhost.c
>>> +++ b/hw/virtio/vhost.c
>>> @@ -1368,23 +1368,23 @@ fail_alloc_desc:
>>>    return r;
>>> }
>>>
>>> -void vhost_virtqueue_stop(struct vhost_dev *dev,
>>> -                          struct VirtIODevice *vdev,
>>> -                          struct vhost_virtqueue *vq,
>>> -                          unsigned idx)
>>> +int vhost_virtqueue_stop(struct vhost_dev *dev,
>>> +                         struct VirtIODevice *vdev,
>>> +                         struct vhost_virtqueue *vq,
>>> +                         unsigned idx)
>>> {
>>>    int vhost_vq_index = dev->vhost_ops->vhost_get_vq_index(dev, idx);
>>>    struct vhost_vring_state state = {
>>>        .index = vhost_vq_index,
>>>    };
>>> -    int r;
>>> +    int r = 0;
>>>
>>>    if (virtio_queue_get_desc_addr(vdev, idx) == 0) {
>>>        /* Don't stop the virtqueue which might have not been started */
>>> -        return;
>>> +        return 0;
>>>    }
>>>
>>> -    r = dev->vhost_ops->vhost_get_vring_base(dev, &state);
>>> +    r |= dev->vhost_ops->vhost_get_vring_base(dev, &state);
>>
>> We can avoid this and also initialize r to 0.
>
>Here we need to do `vhost_virtqueue_stop` for each vq.

Sorry, my question is what's the point of initializing r to 0 and then 
putting it in or here with the result of vhost_get_vring_base?
Can't we leave it as before and initialize it directly here?

>
>>
>>>    if (r < 0) {
>>>        VHOST_OPS_DEBUG(r, "vhost VQ %u ring restore failed: %d", idx, r);
>>>        /* Connection to the backend is broken, so let's sync internal
>>> @@ -1412,6 +1412,7 @@ void vhost_virtqueue_stop(struct vhost_dev *dev,
>>>                       0, virtio_queue_get_avail_size(vdev, idx));
>>>    vhost_memory_unmap(dev, vq->desc, virtio_queue_get_desc_size(vdev, idx),
>>>                       0, virtio_queue_get_desc_size(vdev, idx));
>>> +    return r;
>>> }
>>>
>>> static int vhost_virtqueue_set_busyloop_timeout(struct vhost_dev *dev,
>>> @@ -2136,9 +2137,10 @@ fail_features:
>>> }
>>>
>>> /* Host notifiers must be enabled at this point. */
>>> -void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
>>> +int vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
>>> {
>>>    int i;
>>> +    int rc = 0;
>>>
>>>    /* should only be called after backend is connected */
>>>    assert(hdev->vhost_ops);
>>> @@ -2157,10 +2159,10 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
>>>        vhost_dev_set_vring_enable(hdev, false);
>>>    }
>>>    for (i = 0; i < hdev->nvqs; ++i) {
>>> -        vhost_virtqueue_stop(hdev,
>>> -                             vdev,
>>> -                             hdev->vqs + i,
>>> -                             hdev->vq_index + i);
>>> +        rc |= vhost_virtqueue_stop(hdev,
>>> +                                   vdev,
>>> +                                   hdev->vqs + i,
>>> +                                   hdev->vq_index + i);
>>
>> Also other function can fails, should we consider also them?
>> (e.g. 	, vhost_dev_set_vring_enable, etc.)
>>
>> If not, why?
>
>Since we only want to know the return value of callback when the 
>stopping device
>live migration, there is no need to catch the failure of `vhost_dev_start`.

Please add that in the commit message, and maybe also in a comment here.

>
>We can also catch the failure of `vhost_dev_set_vring_enable`, because
>`vhost_dev_set_vring_enable` will also fail if qemu lost connection with the
>the backend, but I need to test it.
>

Capturing failures of only some things is a little confusing to me, I 
think it needs to be better explained.

Thanks,
Stefano

>
>>
>>>    }
>>>    if (hdev->vhost_ops->vhost_reset_status) {
>>>        hdev->vhost_ops->vhost_reset_status(hdev);
>>> @@ -2177,6 +2179,7 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
>>>    hdev->started = false;
>>>    vdev->vhost_started = false;
>>>    hdev->vdev = NULL;
>>> +    return rc;
>>> }
>>>
>>> int vhost_net_set_backend(struct vhost_dev *hdev,
>>> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
>>> index a9469d50bc..fd96ec9c39 100644
>>> --- a/include/hw/virtio/vhost.h
>>> +++ b/include/hw/virtio/vhost.h
>>> @@ -232,8 +232,10 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings);
>>> * Stop the vhost device. After the device is stopped the notifiers
>>> * can be disabled (@vhost_dev_disable_notifiers) and the device can
>>> * be torn down (@vhost_dev_cleanup).
>>> + *
>>> + * Return: 0 on success, != 0 on error when stopping dev.
>>> */
>>> -void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings);
>>> +int vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings);
>>>
>>> /**
>>> * DOC: vhost device configuration handling
>>> @@ -333,8 +335,8 @@ int vhost_device_iotlb_miss(struct vhost_dev *dev, uint64_t iova, int write);
>>>
>>> int vhost_virtqueue_start(struct vhost_dev *dev, struct VirtIODevice *vdev,
>>>                          struct vhost_virtqueue *vq, unsigned idx);
>>> -void vhost_virtqueue_stop(struct vhost_dev *dev, struct VirtIODevice *vdev,
>>> -                          struct vhost_virtqueue *vq, unsigned idx);
>>> +int vhost_virtqueue_stop(struct vhost_dev *dev, struct VirtIODevice *vdev,
>>> +                         struct vhost_virtqueue *vq, unsigned idx);
>>>
>>> void vhost_dev_reset_inflight(struct vhost_inflight *inflight);
>>> void vhost_dev_free_inflight(struct vhost_inflight *inflight);
>>> --
>>> 2.48.1
>
>

Re: [PATCH v2 2/3] vhost: return failure if stop virtqueue failed in vhost_dev_stop

Posted by Haoqian He 1 week, 1 day ago

2025年3月24日 22:25，Stefano Garzarella <sgarzare@redhat.com> 写道：

On Thu, Mar 20, 2025 at 08:21:25PM +0800, Haoqian He wrote:


2025年3月19日 23:11，Stefano Garzarella <sgarzare@redhat.com> 写道：

On Fri, Mar 14, 2025 at 06:15:33AM -0400, Haoqian He wrote:

The backend maybe crash when vhost_dev_stop and GET_VRING_BASE
would fail, we can return failure to indicate the connection
with the backend is broken.

Signed-off-by: Haoqian He <haoqian.he@smartx.com>
---
hw/virtio/vhost.c         | 27 +++++++++++++++------------
include/hw/virtio/vhost.h |  8 +++++---
2 files changed, 20 insertions(+), 15 deletions(-)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 6aa72fd434..c82bbbe4cc 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1368,23 +1368,23 @@ fail_alloc_desc:
return r;
}

-void vhost_virtqueue_stop(struct vhost_dev *dev,
-                          struct VirtIODevice *vdev,
-                          struct vhost_virtqueue *vq,
-                          unsigned idx)
+int vhost_virtqueue_stop(struct vhost_dev *dev,
+                         struct VirtIODevice *vdev,
+                         struct vhost_virtqueue *vq,
+                         unsigned idx)
{
int vhost_vq_index = dev->vhost_ops->vhost_get_vq_index(dev, idx);
struct vhost_vring_state state = {
 .index = vhost_vq_index,
};
-    int r;
+    int r = 0;

if (virtio_queue_get_desc_addr(vdev, idx) == 0) {
 /* Don't stop the virtqueue which might have not been started */
-        return;
+        return 0;
}

-    r = dev->vhost_ops->vhost_get_vring_base(dev, &state);
+    r |= dev->vhost_ops->vhost_get_vring_base(dev, &state);


We can avoid this and also initialize r to 0.


Here we need to do `vhost_virtqueue_stop` for each vq.


Sorry, my question is what's the point of initializing r to 0 and then
putting it in or here with the result of vhost_get_vring_base?
Can't we leave it as before and initialize it directly here?


Acked.




if (r < 0) {
 VHOST_OPS_DEBUG(r, "vhost VQ %u ring restore failed: %d", idx, r);
 /* Connection to the backend is broken, so let's sync internal
@@ -1412,6 +1412,7 @@ void vhost_virtqueue_stop(struct vhost_dev *dev,
                0, virtio_queue_get_avail_size(vdev, idx));
vhost_memory_unmap(dev, vq->desc, virtio_queue_get_desc_size(vdev, idx),
                0, virtio_queue_get_desc_size(vdev, idx));
+    return r;
}

static int vhost_virtqueue_set_busyloop_timeout(struct vhost_dev *dev,
@@ -2136,9 +2137,10 @@ fail_features:
}

/* Host notifiers must be enabled at this point. */
-void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
+int vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
{
int i;
+    int rc = 0;

/* should only be called after backend is connected */
assert(hdev->vhost_ops);
@@ -2157,10 +2159,10 @@ void vhost_dev_stop(struct vhost_dev *hdev,
VirtIODevice *vdev, bool vrings)
 vhost_dev_set_vring_enable(hdev, false);
}
for (i = 0; i < hdev->nvqs; ++i) {
-        vhost_virtqueue_stop(hdev,
-                             vdev,
-                             hdev->vqs + i,
-                             hdev->vq_index + i);
+        rc |= vhost_virtqueue_stop(hdev,
+                                   vdev,
+                                   hdev->vqs + i,
+                                   hdev->vq_index + i);


Also other function can fails, should we consider also them?
(e.g. , vhost_dev_set_vring_enable, etc.)

If not, why?


Since we only want to know the return value of callback when the stopping device
live migration, there is no need to catch the failure of `vhost_dev_start`.


Please add that in the commit message, and maybe also in a comment here.


We can also catch the failure of `vhost_dev_set_vring_enable`, because
`vhost_dev_set_vring_enable` will also fail if qemu lost connection with the
the backend, but I need to test it.


Capturing failures of only some things is a little confusing to me, I
think it needs to be better explained.


Only capture vhost_virtqueue_stop's error based on the following considerations:

1. `vhost_dev_start` always return 0 when stopping, so there is no need to
capture the error.

2. For `vhost_dev_set_vring_enable`/`vhost_reset_status`, it is necessary to
satisfy that virtio has set certain feature bits (e.g.
VHOST_USER_PROTOCOL_F_STATUS).
Relatively speaking, `vhost_virtqueue_stop` is more universal, if stop
virtiqueue
fails, we will abort the live migration.

3. Even if we capture the error of any function, we cannot return directly, and
we still need to execute the rest of `vhost_dev_stop`.

Thanks,
Haoqian



Thanks,
Stefano



}
if (hdev->vhost_ops->vhost_reset_status) {
 hdev->vhost_ops->vhost_reset_status(hdev);
@@ -2177,6 +2179,7 @@ void vhost_dev_stop(struct vhost_dev *hdev,
VirtIODevice *vdev, bool vrings)
hdev->started = false;
vdev->vhost_started = false;
hdev->vdev = NULL;
+    return rc;
}

int vhost_net_set_backend(struct vhost_dev *hdev,
diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index a9469d50bc..fd96ec9c39 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -232,8 +232,10 @@ int vhost_dev_start(struct vhost_dev *hdev,
VirtIODevice *vdev, bool vrings);
* Stop the vhost device. After the device is stopped the notifiers
* can be disabled (@vhost_dev_disable_notifiers) and the device can
* be torn down (@vhost_dev_cleanup).
+ *
+ * Return: 0 on success, != 0 on error when stopping dev.
*/
-void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings);
+int vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings);

/**
* DOC: vhost device configuration handling
@@ -333,8 +335,8 @@ int vhost_device_iotlb_miss(struct vhost_dev *dev,
uint64_t iova, int write);

int vhost_virtqueue_start(struct vhost_dev *dev, struct VirtIODevice *vdev,
                   struct vhost_virtqueue *vq, unsigned idx);
-void vhost_virtqueue_stop(struct vhost_dev *dev, struct VirtIODevice *vdev,
-                          struct vhost_virtqueue *vq, unsigned idx);
+int vhost_virtqueue_stop(struct vhost_dev *dev, struct VirtIODevice *vdev,
+                         struct vhost_virtqueue *vq, unsigned idx);

void vhost_dev_reset_inflight(struct vhost_inflight *inflight);
void vhost_dev_free_inflight(struct vhost_inflight *inflight);
--
2.48.1

Re: [PATCH v2 2/3] vhost: return failure if stop virtqueue failed in vhost_dev_stop

Posted by Stefano Garzarella 1 week, 1 day ago

On Tue, Mar 25, 2025 at 04:36:53PM +0800, Haoqian He wrote:
>2025年3月24日 22:25，Stefano Garzarella <sgarzare@redhat.com> 写道：
>
>On Thu, Mar 20, 2025 at 08:21:25PM +0800, Haoqian He wrote:
>
>
>2025年3月19日 23:11，Stefano Garzarella <sgarzare@redhat.com> 写道：
>
>On Fri, Mar 14, 2025 at 06:15:33AM -0400, Haoqian He wrote:
>
>The backend maybe crash when vhost_dev_stop and GET_VRING_BASE
>would fail, we can return failure to indicate the connection
>with the backend is broken.
>
>Signed-off-by: Haoqian He <haoqian.he@smartx.com>
>---
>hw/virtio/vhost.c         | 27 +++++++++++++++------------
>include/hw/virtio/vhost.h |  8 +++++---
>2 files changed, 20 insertions(+), 15 deletions(-)
>
>diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
>index 6aa72fd434..c82bbbe4cc 100644
>--- a/hw/virtio/vhost.c
>+++ b/hw/virtio/vhost.c
>@@ -1368,23 +1368,23 @@ fail_alloc_desc:
>return r;
>}
>
>-void vhost_virtqueue_stop(struct vhost_dev *dev,
>-                          struct VirtIODevice *vdev,
>-                          struct vhost_virtqueue *vq,
>-                          unsigned idx)
>+int vhost_virtqueue_stop(struct vhost_dev *dev,
>+                         struct VirtIODevice *vdev,
>+                         struct vhost_virtqueue *vq,
>+                         unsigned idx)
>{
>int vhost_vq_index = dev->vhost_ops->vhost_get_vq_index(dev, idx);
>struct vhost_vring_state state = {
> .index = vhost_vq_index,
>};
>-    int r;
>+    int r = 0;
>
>if (virtio_queue_get_desc_addr(vdev, idx) == 0) {
> /* Don't stop the virtqueue which might have not been started */
>-        return;
>+        return 0;
>}
>
>-    r = dev->vhost_ops->vhost_get_vring_base(dev, &state);
>+    r |= dev->vhost_ops->vhost_get_vring_base(dev, &state);
>
>
>We can avoid this and also initialize r to 0.
>
>
>Here we need to do `vhost_virtqueue_stop` for each vq.
>
>
>Sorry, my question is what's the point of initializing r to 0 and then
>putting it in or here with the result of vhost_get_vring_base?
>Can't we leave it as before and initialize it directly here?
>
>
>Acked.
>
>
>
>
>if (r < 0) {
> VHOST_OPS_DEBUG(r, "vhost VQ %u ring restore failed: %d", idx, r);
> /* Connection to the backend is broken, so let's sync internal
>@@ -1412,6 +1412,7 @@ void vhost_virtqueue_stop(struct vhost_dev *dev,
>                0, virtio_queue_get_avail_size(vdev, idx));
>vhost_memory_unmap(dev, vq->desc, virtio_queue_get_desc_size(vdev, idx),
>                0, virtio_queue_get_desc_size(vdev, idx));
>+    return r;
>}
>
>static int vhost_virtqueue_set_busyloop_timeout(struct vhost_dev *dev,
>@@ -2136,9 +2137,10 @@ fail_features:
>}
>
>/* Host notifiers must be enabled at this point. */
>-void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
>+int vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
>{
>int i;
>+    int rc = 0;
>
>/* should only be called after backend is connected */
>assert(hdev->vhost_ops);
>@@ -2157,10 +2159,10 @@ void vhost_dev_stop(struct vhost_dev *hdev,
>VirtIODevice *vdev, bool vrings)
> vhost_dev_set_vring_enable(hdev, false);
>}
>for (i = 0; i < hdev->nvqs; ++i) {
>-        vhost_virtqueue_stop(hdev,
>-                             vdev,
>-                             hdev->vqs + i,
>-                             hdev->vq_index + i);
>+        rc |= vhost_virtqueue_stop(hdev,
>+                                   vdev,
>+                                   hdev->vqs + i,
>+                                   hdev->vq_index + i);
>
>
>Also other function can fails, should we consider also them?
>(e.g. , vhost_dev_set_vring_enable, etc.)
>
>If not, why?
>
>
>Since we only want to know the return value of callback when the stopping device
>live migration, there is no need to catch the failure of `vhost_dev_start`.
>
>
>Please add that in the commit message, and maybe also in a comment here.
>
>
>We can also catch the failure of `vhost_dev_set_vring_enable`, because
>`vhost_dev_set_vring_enable` will also fail if qemu lost connection with the
>the backend, but I need to test it.
>
>
>Capturing failures of only some things is a little confusing to me, I
>think it needs to be better explained.
>
>
>Only capture vhost_virtqueue_stop's error based on the following considerations:
>
>1. `vhost_dev_start` always return 0 when stopping, so there is no need to
>capture the error.
>
>2. For `vhost_dev_set_vring_enable`/`vhost_reset_status`, it is necessary to
>satisfy that virtio has set certain feature bits (e.g.
>VHOST_USER_PROTOCOL_F_STATUS).
>Relatively speaking, `vhost_virtqueue_stop` is more universal, if stop
>virtiqueue
>fails, we will abort the live migration.
>
>3. Even if we capture the error of any function, we cannot return directly, and
>we still need to execute the rest of `vhost_dev_stop`.
>

This is exactly the information that a commit message should contain, so 
I suggest you add this kind of information to each commit to simplify 
both review, but also in the future possible bisection for issues.

Thanks,
Stefano

>}
>if (hdev->vhost_ops->vhost_reset_status) {
> hdev->vhost_ops->vhost_reset_status(hdev);
>@@ -2177,6 +2179,7 @@ void vhost_dev_stop(struct vhost_dev *hdev,
>VirtIODevice *vdev, bool vrings)
>hdev->started = false;
>vdev->vhost_started = false;
>hdev->vdev = NULL;
>+    return rc;
>}
>
>int vhost_net_set_backend(struct vhost_dev *hdev,
>diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
>index a9469d50bc..fd96ec9c39 100644
>--- a/include/hw/virtio/vhost.h
>+++ b/include/hw/virtio/vhost.h
>@@ -232,8 +232,10 @@ int vhost_dev_start(struct vhost_dev *hdev,
>VirtIODevice *vdev, bool vrings);
>* Stop the vhost device. After the device is stopped the notifiers
>* can be disabled (@vhost_dev_disable_notifiers) and the device can
>* be torn down (@vhost_dev_cleanup).
>+ *
>+ * Return: 0 on success, != 0 on error when stopping dev.
>*/
>-void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings);
>+int vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings);
>
>/**
>* DOC: vhost device configuration handling
>@@ -333,8 +335,8 @@ int vhost_device_iotlb_miss(struct vhost_dev *dev,
>uint64_t iova, int write);
>
>int vhost_virtqueue_start(struct vhost_dev *dev, struct VirtIODevice *vdev,
>                   struct vhost_virtqueue *vq, unsigned idx);
>-void vhost_virtqueue_stop(struct vhost_dev *dev, struct VirtIODevice *vdev,
>-                          struct vhost_virtqueue *vq, unsigned idx);
>+int vhost_virtqueue_stop(struct vhost_dev *dev, struct VirtIODevice *vdev,
>+                         struct vhost_virtqueue *vq, unsigned idx);
>
>void vhost_dev_reset_inflight(struct vhost_inflight *inflight);
>void vhost_dev_free_inflight(struct vhost_inflight *inflight);
>--
>2.48.1
>

[PATCH v2 3/3] vhost-user: return failure if backend crash when live migration

Posted by Haoqian He 2 weeks, 5 days ago

Live migration should be terminated if the backend crashes before
the migration completes.

Since the vhost device will be stopped when VM is stopped before
the end of the live migration, current implementation if vhost
backend died, vhost device's set_status() will not return failure,
live migration won't perceive the disconnection between qemu and
vhost backend, inflight io would be submitted in migration target
host, leading to IO error.

To fix this issue:
1. Add set_status_ext() which has return value for
VirtioDeviceClass and vhost-user-blk/scsi use the _ext version.

2. In set_status_ext(), return failure if the flag `connected`
is false or vhost_dev_stop return failure, which means qemu lost
connection with backend.

Hence migration_completion() will process failure, terminate
migration and restore VM.

Signed-off-by: Haoqian He <haoqian.he@smartx.com>
---
 hw/block/vhost-user-blk.c             | 29 +++++++++++++++------------
 hw/scsi/vhost-scsi-common.c           | 13 ++++++------
 hw/scsi/vhost-user-scsi.c             | 20 ++++++++++--------
 hw/virtio/virtio.c                    | 20 +++++++++++++-----
 include/hw/virtio/vhost-scsi-common.h |  2 +-
 include/hw/virtio/virtio.h            |  1 +
 6 files changed, 52 insertions(+), 33 deletions(-)

diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
index ae42327cf8..4865786c54 100644
--- a/hw/block/vhost-user-blk.c
+++ b/hw/block/vhost-user-blk.c
@@ -204,7 +204,7 @@ err_host_notifiers:
     return ret;
 }
 
-static void vhost_user_blk_stop(VirtIODevice *vdev)
+static int vhost_user_blk_stop(VirtIODevice *vdev)
 {
     VHostUserBlk *s = VHOST_USER_BLK(vdev);
     BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
@@ -212,26 +212,26 @@ static void vhost_user_blk_stop(VirtIODevice *vdev)
     int ret;
 
     if (!s->started_vu) {
-        return;
+        return 0;
     }
     s->started_vu = false;
 
     if (!k->set_guest_notifiers) {
-        return;
+        return 0;
     }
 
-    vhost_dev_stop(&s->dev, vdev, true);
+    ret = vhost_dev_stop(&s->dev, vdev, true);
 
-    ret = k->set_guest_notifiers(qbus->parent, s->dev.nvqs, false);
-    if (ret < 0) {
+    if (k->set_guest_notifiers(qbus->parent, s->dev.nvqs, false) < 0) {
         error_report("vhost guest notifier cleanup failed: %d", ret);
-        return;
+        return -1;
     }
 
     vhost_dev_disable_notifiers(&s->dev, vdev);
+    return ret;
 }
 
-static void vhost_user_blk_set_status(VirtIODevice *vdev, uint8_t status)
+static int vhost_user_blk_set_status(VirtIODevice *vdev, uint8_t status)
 {
     VHostUserBlk *s = VHOST_USER_BLK(vdev);
     bool should_start = virtio_device_should_start(vdev, status);
@@ -239,11 +239,11 @@ static void vhost_user_blk_set_status(VirtIODevice *vdev, uint8_t status)
     int ret;
 
     if (!s->connected) {
-        return;
+        return -1;
     }
 
     if (vhost_dev_is_started(&s->dev) == should_start) {
-        return;
+        return 0;
     }
 
     if (should_start) {
@@ -253,9 +253,12 @@ static void vhost_user_blk_set_status(VirtIODevice *vdev, uint8_t status)
             qemu_chr_fe_disconnect(&s->chardev);
         }
     } else {
-        vhost_user_blk_stop(vdev);
+        ret = vhost_user_blk_stop(vdev);
+        if (ret < 0) {
+            return ret;
+        }
     }
-
+    return 0;
 }
 
 static uint64_t vhost_user_blk_get_features(VirtIODevice *vdev,
@@ -597,7 +600,7 @@ static void vhost_user_blk_class_init(ObjectClass *klass, void *data)
     vdc->get_config = vhost_user_blk_update_config;
     vdc->set_config = vhost_user_blk_set_config;
     vdc->get_features = vhost_user_blk_get_features;
-    vdc->set_status = vhost_user_blk_set_status;
+    vdc->set_status_ext = vhost_user_blk_set_status;
     vdc->reset = vhost_user_blk_reset;
     vdc->get_vhost = vhost_user_blk_get_vhost;
 }
diff --git a/hw/scsi/vhost-scsi-common.c b/hw/scsi/vhost-scsi-common.c
index 4c8637045d..43525ba46d 100644
--- a/hw/scsi/vhost-scsi-common.c
+++ b/hw/scsi/vhost-scsi-common.c
@@ -101,24 +101,25 @@ err_host_notifiers:
     return ret;
 }
 
-void vhost_scsi_common_stop(VHostSCSICommon *vsc)
+int vhost_scsi_common_stop(VHostSCSICommon *vsc)
 {
     VirtIODevice *vdev = VIRTIO_DEVICE(vsc);
     BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
     VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
     int ret = 0;
 
-    vhost_dev_stop(&vsc->dev, vdev, true);
+    ret = vhost_dev_stop(&vsc->dev, vdev, true);
 
     if (k->set_guest_notifiers) {
-        ret = k->set_guest_notifiers(qbus->parent, vsc->dev.nvqs, false);
-        if (ret < 0) {
-                error_report("vhost guest notifier cleanup failed: %d", ret);
+        int r = k->set_guest_notifiers(qbus->parent, vsc->dev.nvqs, false);
+        if (r < 0) {
+            error_report("vhost guest notifier cleanup failed: %d", ret);
+            return r;
         }
     }
-    assert(ret >= 0);
 
     vhost_dev_disable_notifiers(&vsc->dev, vdev);
+    return ret;
 }
 
 uint64_t vhost_scsi_common_get_features(VirtIODevice *vdev, uint64_t features,
diff --git a/hw/scsi/vhost-user-scsi.c b/hw/scsi/vhost-user-scsi.c
index adb41b9816..8e7efc38f2 100644
--- a/hw/scsi/vhost-user-scsi.c
+++ b/hw/scsi/vhost-user-scsi.c
@@ -52,19 +52,19 @@ static int vhost_user_scsi_start(VHostUserSCSI *s, Error **errp)
     return ret;
 }
 
-static void vhost_user_scsi_stop(VHostUserSCSI *s)
+static int vhost_user_scsi_stop(VHostUserSCSI *s)
 {
     VHostSCSICommon *vsc = VHOST_SCSI_COMMON(s);
 
     if (!s->started_vu) {
-        return;
+        return 0;
     }
     s->started_vu = false;
 
-    vhost_scsi_common_stop(vsc);
+    return vhost_scsi_common_stop(vsc);
 }
 
-static void vhost_user_scsi_set_status(VirtIODevice *vdev, uint8_t status)
+static int vhost_user_scsi_set_status(VirtIODevice *vdev, uint8_t status)
 {
     VHostUserSCSI *s = (VHostUserSCSI *)vdev;
     DeviceState *dev = DEVICE(vdev);
@@ -75,11 +75,11 @@ static void vhost_user_scsi_set_status(VirtIODevice *vdev, uint8_t status)
     int ret;
 
     if (!s->connected) {
-        return;
+        return -1;
     }
 
     if (vhost_dev_is_started(&vsc->dev) == should_start) {
-        return;
+        return 0;
     }
 
     if (should_start) {
@@ -91,8 +91,12 @@ static void vhost_user_scsi_set_status(VirtIODevice *vdev, uint8_t status)
             qemu_chr_fe_disconnect(&vs->conf.chardev);
         }
     } else {
-        vhost_user_scsi_stop(s);
+        ret = vhost_user_scsi_stop(s);
+        if (ret) {
+            return ret;
+        }
     }
+    return 0;
 }
 
 static void vhost_user_scsi_handle_output(VirtIODevice *vdev, VirtQueue *vq)
@@ -399,7 +403,7 @@ static void vhost_user_scsi_class_init(ObjectClass *klass, void *data)
     vdc->unrealize = vhost_user_scsi_unrealize;
     vdc->get_features = vhost_scsi_common_get_features;
     vdc->set_config = vhost_scsi_common_set_config;
-    vdc->set_status = vhost_user_scsi_set_status;
+    vdc->set_status_ext = vhost_user_scsi_set_status;
     fwc->get_dev_path = vhost_scsi_common_get_fw_dev_path;
     vdc->reset = vhost_user_scsi_reset;
     vdc->get_vhost = vhost_user_scsi_get_vhost;
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 5e8d4cab53..fff7cdb35d 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -2221,12 +2221,12 @@ int virtio_set_status(VirtIODevice *vdev, uint8_t val)
 {
     VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
     trace_virtio_set_status(vdev, val);
+    int ret = 0;
 
     if (virtio_vdev_has_feature(vdev, VIRTIO_F_VERSION_1)) {
         if (!(vdev->status & VIRTIO_CONFIG_S_FEATURES_OK) &&
             val & VIRTIO_CONFIG_S_FEATURES_OK) {
-            int ret = virtio_validate_features(vdev);
-
+            ret = virtio_validate_features(vdev);
             if (ret) {
                 return ret;
             }
@@ -2238,12 +2238,18 @@ int virtio_set_status(VirtIODevice *vdev, uint8_t val)
         virtio_set_started(vdev, val & VIRTIO_CONFIG_S_DRIVER_OK);
     }
 
-    if (k->set_status) {
+    if (k->set_status_ext) {
+        ret = k->set_status_ext(vdev, val);
+        if (ret) {
+            qemu_log("set %s status to %d failed, old status: %d\n",
+                     vdev->name, val, vdev->status);
+        }
+    } else if (k->set_status) {
         k->set_status(vdev, val);
     }
     vdev->status = val;
 
-    return 0;
+    return ret;
 }
 
 static enum virtio_device_endian virtio_default_endian(void)
@@ -3436,7 +3442,11 @@ static int virtio_vmstate_change(void *opaque, bool running, RunState state)
     }
 
     if (!backend_run) {
-        virtio_set_status(vdev, vdev->status);
+        // the return value was used for stopping VM during migration
+        int ret = virtio_set_status(vdev, vdev->status);
+        if (ret) {
+            return ret;
+        }
     }
     return 0;
 }
diff --git a/include/hw/virtio/vhost-scsi-common.h b/include/hw/virtio/vhost-scsi-common.h
index c5d2c09455..d54d9c916f 100644
--- a/include/hw/virtio/vhost-scsi-common.h
+++ b/include/hw/virtio/vhost-scsi-common.h
@@ -40,7 +40,7 @@ struct VHostSCSICommon {
 };
 
 int vhost_scsi_common_start(VHostSCSICommon *vsc, Error **errp);
-void vhost_scsi_common_stop(VHostSCSICommon *vsc);
+int vhost_scsi_common_stop(VHostSCSICommon *vsc);
 char *vhost_scsi_common_get_fw_dev_path(FWPathProvider *p, BusState *bus,
                                         DeviceState *dev);
 void vhost_scsi_common_set_config(VirtIODevice *vdev, const uint8_t *config);
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index 6386910280..c99d56f519 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -187,6 +187,7 @@ struct VirtioDeviceClass {
     void (*set_config)(VirtIODevice *vdev, const uint8_t *config);
     void (*reset)(VirtIODevice *vdev);
     void (*set_status)(VirtIODevice *vdev, uint8_t val);
+    int (*set_status_ext)(VirtIODevice *vdev, uint8_t val);
     /* Device must validate queue_index.  */
     void (*queue_reset)(VirtIODevice *vdev, uint32_t queue_index);
     /* Device must validate queue_index.  */
-- 
2.48.1