[PATCH v3 0/3] migration: propagate vTPM errors using Error objects

Arun Menon posted 3 patches 4 months, 2 weeks ago
git fetch https://github.com/patchew-project/qemu tags/patchew/20250702-propagate._5Ftpm._5Ferror-v3-0-986d94540528@redhat.com
Maintainers: Stefan Berger <stefanb@linux.vnet.ibm.com>, "Michael S. Tsirkin" <mst@redhat.com>, Marcel Apfelbaum <marcel.apfelbaum@gmail.com>, Cornelia Huck <cohuck@redhat.com>, Halil Pasic <pasic@linux.ibm.com>, Eric Farman <farman@linux.ibm.com>, Thomas Huth <thuth@redhat.com>, Christian Borntraeger <borntraeger@linux.ibm.com>, Matthew Rosato <mjrosato@linux.ibm.com>, Richard Henderson <richard.henderson@linaro.org>, David Hildenbrand <david@redhat.com>, Ilya Leoshkevich <iii@linux.ibm.com>, Nicholas Piggin <npiggin@gmail.com>, Daniel Henrique Barboza <danielhb413@gmail.com>, Harsh Prateek Bora <harshpb@linux.ibm.com>, Paolo Bonzini <pbonzini@redhat.com>, Fam Zheng <fam@euphon.net>, Alex Williamson <alex.williamson@redhat.com>, "Cédric Le Goater" <clg@redhat.com>, Peter Xu <peterx@redhat.com>, Fabiano Rosas <farosas@suse.de>, Hailiang Zhang <zhanghailiang@xfusion.com>, Steve Sistare <steven.sistare@oracle.com>
There is a newer version of this series
Currently, when a migration of a VM with an encrypted vTPM
fails on the destination host (e.g., due to a mismatch in secret values),
the error message displayed on the source host is generic and unhelpful.

For example, a typical error looks like this:
"operation failed: job 'migration out' failed: Sibling indicated error 1.
operation failed: job 'migration in' failed: load of migration failed:
Input/output error"

This message does not provide any specific indication of a vTPM failure.
Such generic errors are logged using error_report(), which prints to
the console/monitor but does not make the detailed error accessible via
the QMP query-migrate command.

This series addresses the issue by ensuring that specific TPM error
messages are propagated via the QEMU Error object.
To make this possible:
- The functions in the VM state loading call stack are changed
  to take an Error object (errp) as an additional parameter.
- The TPM backend uses a new hook, post_load_with_error(),
  that explicitly passes an Error object.
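
The two points above can be sketched roughly as follows. This is a
minimal, standalone illustration and not code from the series: the
Error type and every sketch_* name are hypothetical stand-ins for
QEMU's Error API and VMState structures, and the hook signature
(opaque, version_id, errp) is an assumption modelled on the existing
post_load() hook.

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for QEMU's Error object; not the real API. */
typedef struct Error {
    char msg[256];
} Error;

/* Stand-in for error_setg(): record a message if the caller asked for one. */
static void sketch_error_setg(Error **errp, const char *fmt, ...)
{
    va_list ap;
    Error *err;

    if (!errp) {
        return;                 /* caller does not want error details */
    }
    assert(*errp == NULL);      /* never overwrite an earlier error */
    err = calloc(1, sizeof(*err));
    if (!err) {
        abort();
    }
    va_start(ap, fmt);
    vsnprintf(err->msg, sizeof(err->msg), fmt, ap);
    va_end(ap);
    *errp = err;
}

static void sketch_error_free(Error *err)
{
    free(err);
}

/* Hypothetical device description offering both hook styles. */
typedef struct SketchVMState {
    const char *name;
    int (*post_load)(void *opaque, int version_id);
    int (*post_load_with_error)(void *opaque, int version_id, Error **errp);
} SketchVMState;

/* Made-up backend hook: opaque points to an int saying if the secret matched. */
static int sketch_tpm_post_load(void *opaque, int version_id, Error **errp)
{
    int secret_matches = *(int *)opaque;

    (void)version_id;
    if (!secret_matches) {
        sketch_error_setg(errp, "tpm: could not restore state: secret mismatch");
        return -EIO;
    }
    return 0;
}

/* Loader: prefer the Error-aware hook, fall back to the integer-only one. */
static int sketch_vmstate_load(const SketchVMState *vmsd, void *opaque,
                               int version_id, Error **errp)
{
    int ret = 0;

    if (vmsd->post_load_with_error) {
        ret = vmsd->post_load_with_error(opaque, version_id, errp);
    } else if (vmsd->post_load) {
        ret = vmsd->post_load(opaque, version_id);
        if (ret < 0) {
            /* Only a generic message is available on this path. */
            sketch_error_setg(errp, "%s: post_load failed: %d", vmsd->name, ret);
        }
    }
    return ret;
}
```

A caller would pass the address of a NULL-initialised Error pointer to
sketch_vmstate_load() and, on a negative return, find the
backend-specific message in it instead of only an errno-style code.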

While this series focuses specifically on TPM error reporting during
live migration, it lays the groundwork for broader improvements.
Most functions in savevm.c that previously returned an integer now capture
errors in the Error object, enabling other modules to adopt the
post_load_with_error hook in the future.

One such change was attempted previously:
https://lists.gnu.org/archive/html/qemu-devel/2021-02/msg01727.html

The series does not have to be applied in one go. Each patch
can be compiled and tested separately.

Resolves: https://issues.redhat.com/browse/RHEL-82826

Signed-off-by: Arun Menon <armenon@redhat.com>
---
Changes in v3:
- Split the 2nd patch into two: introducing the post_load_with_error()
  hook is now separate from using it in the TPM backend, so that
  each part can be reviewed and acknowledged separately.
- Link to v2: https://lore.kernel.org/qemu-devel/20250627-propagate_tpm_error-v2-0-85990c89da29@redhat.com

Changes in v2:
- Combine the first two changes into one, focusing on passing the
  Error object (errp) consistently through functions involved in
  loading the VM's state. Other functions are not yet changed.
- As suggested in the review comment, add null checks for errp
  before adding error messages, preventing crashes.
  We also now correctly set errors when post-copy migration fails.
- In process_incoming_migration_co(), switch to error_prepend
  instead of error_setg. This means we now null-check local_err in
  the "fail" section before using it, preventing dereferencing issues.
- Link to v1: https://lore.kernel.org/qemu-devel/20250624-propagate_tpm_error-v1-0-2171487a593d@redhat.com

---
Arun Menon (3):
      migration: Pass Error object errp into vm state loading functions
      migration: Introduce a post_load_with_error hook
      backends/tpm: Propagate vTPM error on migration failure

 backends/tpm/tpm_emulator.c |  39 +++++++------
 hw/display/virtio-gpu.c     |   2 +-
 hw/pci/pci.c                |   2 +-
 hw/s390x/virtio-ccw.c       |   2 +-
 hw/scsi/spapr_vscsi.c       |   2 +-
 hw/vfio/pci.c               |   2 +-
 hw/virtio/virtio-mmio.c     |   2 +-
 hw/virtio/virtio-pci.c      |   2 +-
 hw/virtio/virtio.c          |   4 +-
 include/migration/vmstate.h |   3 +-
 migration/colo.c            |  13 +++--
 migration/cpr.c             |   2 +-
 migration/migration.c       |  19 ++++--
 migration/savevm.c          | 137 +++++++++++++++++++++++++++-----------------
 migration/savevm.h          |   7 ++-
 migration/vmstate-types.c   |  10 ++--
 migration/vmstate.c         |  44 ++++++++------
 tests/unit/test-vmstate.c   |  18 +++---
 18 files changed, 182 insertions(+), 128 deletions(-)
---
base-commit: 43ba160cb4bbb193560eb0d2d7decc4b5fc599fe
change-id: 20250624-propagate_tpm_error-bf4ae6c23d30

Best regards,
-- 
Arun Menon <armenon@redhat.com>