[PATCH v5 22/34] hw/vfio: generate new file fd for pseudo device and rebind existing descriptors

Ani Sinha posted 34 patches 1 month, 3 weeks ago
Maintainers: Paolo Bonzini <pbonzini@redhat.com>, Eduardo Habkost <eduardo@habkost.net>, Marcel Apfelbaum <marcel.apfelbaum@gmail.com>, "Philippe Mathieu-Daudé" <philmd@linaro.org>, Yanan Wang <wangyanan55@huawei.com>, Zhao Liu <zhao1.liu@intel.com>, "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>, Richard Henderson <richard.henderson@linaro.org>, "Michael S. Tsirkin" <mst@redhat.com>, Bernhard Beschow <shentey@gmail.com>, Alex Williamson <alex@shazbot.org>, "Cédric Le Goater" <clg@redhat.com>, Peter Xu <peterx@redhat.com>, Fabiano Rosas <farosas@suse.de>, Eric Blake <eblake@redhat.com>, Markus Armbruster <armbru@redhat.com>, "Daniel P. Berrangé" <berrange@redhat.com>, Ani Sinha <anisinha@redhat.com>, Marcelo Tosatti <mtosatti@redhat.com>, David Woodhouse <dwmw2@infradead.org>, Paul Durrant <paul@xen.org>
There is a newer version of this series
Posted by Ani Sinha 1 month, 3 weeks ago
Normally the vfio pseudo device file descriptor lives for the life of the VM.
However, when the kvm VM file descriptor changes, a new file descriptor
for the pseudo device needs to be generated against the new kvm VM descriptor.
Other existing vfio descriptors need to be reattached to the new pseudo device
descriptor. This change performs the above steps.

Tested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Ani Sinha <anisinha@redhat.com>
---
 hw/vfio/helpers.c | 92 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 92 insertions(+)

diff --git a/hw/vfio/helpers.c b/hw/vfio/helpers.c
index f68f8165d0..e2bedd15ec 100644
--- a/hw/vfio/helpers.c
+++ b/hw/vfio/helpers.c
@@ -116,6 +116,89 @@ bool vfio_get_info_dma_avail(struct vfio_iommu_type1_info *info,
  * we'll re-use it should another vfio device be attached before then.
  */
 int vfio_kvm_device_fd = -1;
+
+/*
+ * Confidential virtual machines:
+ * During reset of confidential vms, the kvm vm file descriptor changes.
+ * In this case, the old vfio kvm file descriptor is
+ * closed and a new descriptor is created against the new kvm vm file
+ * descriptor.
+ */
+
+typedef struct VFIODeviceFd {
+    int fd;
+    QLIST_ENTRY(VFIODeviceFd) node;
+} VFIODeviceFd;
+
+static QLIST_HEAD(, VFIODeviceFd) vfio_device_fds =
+    QLIST_HEAD_INITIALIZER(vfio_device_fds);
+
+static void vfio_device_fd_list_add(int fd)
+{
+    VFIODeviceFd *file_fd;
+    file_fd = g_malloc0(sizeof(*file_fd));
+    file_fd->fd = fd;
+    QLIST_INSERT_HEAD(&vfio_device_fds, file_fd, node);
+}
+
+static void vfio_device_fd_list_remove(int fd)
+{
+    VFIODeviceFd *file_fd, *next;
+
+    QLIST_FOREACH_SAFE(file_fd, &vfio_device_fds, node, next) {
+        if (file_fd->fd == fd) {
+            QLIST_REMOVE(file_fd, node);
+            g_free(file_fd);
+            break;
+        }
+    }
+}
+
+static int vfio_device_fd_rebind(NotifierWithReturn *notifier, void *data,
+                                  Error **errp)
+{
+    VFIODeviceFd *file_fd;
+    int ret = 0;
+    struct kvm_device_attr attr = {
+        .group = KVM_DEV_VFIO_FILE,
+        .attr = KVM_DEV_VFIO_FILE_ADD,
+    };
+    struct kvm_create_device cd = {
+        .type = KVM_DEV_TYPE_VFIO,
+    };
+
+    /* we are not interested in pre vmfd change notification */
+    if (((VmfdChangeNotifier *)data)->pre) {
+        return 0;
+    }
+
+    if (kvm_vm_ioctl(kvm_state, KVM_CREATE_DEVICE, &cd)) {
+        error_setg_errno(errp, errno, "Failed to create KVM VFIO device");
+        return -errno;
+    }
+
+    if (vfio_kvm_device_fd != -1) {
+        close(vfio_kvm_device_fd);
+    }
+
+    vfio_kvm_device_fd = cd.fd;
+
+    QLIST_FOREACH(file_fd, &vfio_device_fds, node) {
+        attr.addr = (uint64_t)(unsigned long)&file_fd->fd;
+        if (ioctl(vfio_kvm_device_fd, KVM_SET_DEVICE_ATTR, &attr)) {
+            error_setg_errno(errp, errno,
+                             "Failed to add fd %d to KVM VFIO device",
+                             file_fd->fd);
+            ret = -errno;
+        }
+    }
+    return ret;
+}
+
+static struct NotifierWithReturn vfio_vmfd_change_notifier = {
+    .notify = vfio_device_fd_rebind,
+};
+
 #endif
 
 void vfio_kvm_device_close(void)
@@ -153,6 +236,11 @@ int vfio_kvm_device_add_fd(int fd, Error **errp)
         }
 
         vfio_kvm_device_fd = cd.fd;
+        /*
+         * If the vm file descriptor changes, add a notifier so that we can
+         * re-create the vfio_kvm_device_fd.
+         */
+        kvm_vmfd_add_change_notifier(&vfio_vmfd_change_notifier);
     }
 
     if (ioctl(vfio_kvm_device_fd, KVM_SET_DEVICE_ATTR, &attr)) {
@@ -160,6 +248,8 @@ int vfio_kvm_device_add_fd(int fd, Error **errp)
                          fd);
         return -errno;
     }
+
+    vfio_device_fd_list_add(fd);
 #endif
     return 0;
 }
@@ -183,6 +273,8 @@ int vfio_kvm_device_del_fd(int fd, Error **errp)
                          "Failed to remove fd %d from KVM VFIO device", fd);
         return -errno;
     }
+
+    vfio_device_fd_list_remove(fd);
 #endif
     return 0;
 }
-- 
2.42.0
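
The bookkeeping described in the commit message (remember every attached fd, then replay the list against a freshly created pseudo device fd) can be sketched outside QEMU as follows. This is a minimal illustration under stated assumptions, not the code from the patch: `FdTracker`, the `tracker_*` functions, `attach_fn` and the two demo callbacks are hypothetical names, and the callback merely stands in for the `KVM_SET_DEVICE_ATTR` ioctl. Unlike the patch, the sketch reports the first failure rather than the last one while still attempting every fd.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the KVM_SET_DEVICE_ATTR ioctl: attaches one
 * tracked fd to the pseudo device; returns 0 or a negative errno value. */
typedef int (*attach_fn)(int device_fd, int fd);

typedef struct TrackedFd {
    int fd;
    struct TrackedFd *next;
} TrackedFd;

typedef struct FdTracker {
    TrackedFd *head;   /* every fd currently attached */
    int device_fd;     /* current pseudo device fd, -1 if none */
} FdTracker;

/* Remember an fd after it has been attached successfully. */
static void tracker_add(FdTracker *t, int fd)
{
    TrackedFd *n = calloc(1, sizeof(*n));
    assert(n != NULL);
    n->fd = fd;
    n->next = t->head;
    t->head = n;
}

/* Forget an fd after it has been detached; unknown fds are ignored. */
static void tracker_remove(FdTracker *t, int fd)
{
    for (TrackedFd **p = &t->head; *p; p = &(*p)->next) {
        if ((*p)->fd == fd) {
            TrackedFd *dead = *p;
            *p = dead->next;
            free(dead);
            return;
        }
    }
}

/* Switch to a new device fd and replay every tracked fd against it.
 * All fds are attempted; the first failure is the one reported. */
static int tracker_rebind(FdTracker *t, int new_device_fd, attach_fn attach)
{
    int ret = 0;

    t->device_fd = new_device_fd;
    for (TrackedFd *n = t->head; n; n = n->next) {
        int rc = attach(t->device_fd, n->fd);
        if (rc < 0 && ret == 0) {
            ret = rc;
        }
    }
    return ret;
}

/* Demo callbacks (hypothetical): count calls; one rejects fd 7. */
static int attach_calls;

static int attach_ok(int device_fd, int fd)
{
    (void)device_fd;
    (void)fd;
    attach_calls++;
    return 0;
}

static int attach_reject_7(int device_fd, int fd)
{
    (void)device_fd;
    attach_calls++;
    return fd == 7 ? -EBADF : 0;
}
```

A caller would pair `tracker_add()` with each successful attach and `tracker_remove()` with each successful detach, then call `tracker_rebind()` from the post-change notification. Keeping only the first error is a design choice for the sketch; it is useful wherever the error sink is expected to be set at most once per call.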


Re: [PATCH v5 22/34] hw/vfio: generate new file fd for pseudo device and rebind existing descriptors
Posted by Cédric Le Goater 1 month, 3 weeks ago
On 2/18/26 12:42, Ani Sinha wrote:
> Normally the vfio pseudo device file descriptor lives for the life of the VM.
> However, when the kvm VM file descriptor changes, a new file descriptor
> for the pseudo device needs to be generated against the new kvm VM descriptor.
> Other existing vfio descriptors need to be reattached to the new pseudo device
> descriptor. This change performs the above steps.
> 
> Tested-by: Cédric Le Goater <clg@redhat.com>

There is a regression since the last version.

Both 'reboot' from the guest and the 'system_reset' command from the QEMU
monitor now produce this output:

   qemu-system-x86_64: info: virtual machine state has been rebuilt with new guest file handle.
   qemu-system-x86_64: info: virtual machine state has been rebuilt with new guest file handle.
   qemu-system-x86_64: info: virtual machine state has been rebuilt with new guest file handle.
   ...

and QEMU exits after a while.



> Signed-off-by: Ani Sinha <anisinha@redhat.com>

Anyhow this patch looks good.

Reviewed-by: Cédric Le Goater <clg@redhat.com>

Thanks,

C.



Re: [PATCH v5 22/34] hw/vfio: generate new file fd for pseudo device and rebind existing descriptors
Posted by Ani Sinha 1 month, 3 weeks ago

> On 18 Feb 2026, at 7:37 PM, Cédric Le Goater <clg@redhat.com> wrote:
> 
> On 2/18/26 12:42, Ani Sinha wrote:
>> Normally the vfio pseudo device file descriptor lives for the life of the VM.
>> However, when the kvm VM file descriptor changes, a new file descriptor
>> for the pseudo device needs to be generated against the new kvm VM descriptor.
>> Other existing vfio descriptors need to be reattached to the new pseudo device
>> descriptor. This change performs the above steps.
>> Tested-by: Cédric Le Goater <clg@redhat.com>
> 
> There is a regression since the last version.
> 
> Both 'reboot' from the guest and the 'system_reset' command from the QEMU
> monitor now produce this output:
> 
>  qemu-system-x86_64: info: virtual machine state has been rebuilt with new guest file handle.
>  qemu-system-x86_64: info: virtual machine state has been rebuilt with new guest file handle.
>  qemu-system-x86_64: info: virtual machine state has been rebuilt with new guest file handle.
>  ...
> 
> and QEMU exits after a while.

I have only seen this with SEV-ES and more than one vCPU, never with TDX or SEV-SNP (single or multiple vCPUs).
On which host/guest type did you see this?

Re: [PATCH v5 22/34] hw/vfio: generate new file fd for pseudo device and rebind existing descriptors
Posted by Cédric Le Goater 1 month, 3 weeks ago
On 2/18/26 16:07, Ani Sinha wrote:
> 
> 
>> On 18 Feb 2026, at 7:37 PM, Cédric Le Goater <clg@redhat.com> wrote:
>>
>> On 2/18/26 12:42, Ani Sinha wrote:
>>> Normally the vfio pseudo device file descriptor lives for the life of the VM.
>>> However, when the kvm VM file descriptor changes, a new file descriptor
>>> for the pseudo device needs to be generated against the new kvm VM descriptor.
>>> Other existing vfio descriptors need to be reattached to the new pseudo device
>>> descriptor. This change performs the above steps.
>>> Tested-by: Cédric Le Goater <clg@redhat.com>
>>
>> There is a regression since the last version.
>>
>> Both 'reboot' from the guest and the 'system_reset' command from the QEMU
>> monitor now produce this output:
>>
>>   qemu-system-x86_64: info: virtual machine state has been rebuilt with new guest file handle.
>>   qemu-system-x86_64: info: virtual machine state has been rebuilt with new guest file handle.
>>   qemu-system-x86_64: info: virtual machine state has been rebuilt with new guest file handle.
>>   ...
>>
>> and QEMU exits after a while.
> 
> I have only seen this with SEV-ES and more than one vCPU, never with TDX or SEV-SNP (single or multiple vCPUs).
> On which host/guest type did you see this?

SEV-SNP on a RHEL9 host. It is the same guest I used before, and the host says:

   [1816531.409591] kvm_amd: SEV-ES guest requested termination: 0x0:0x0

Thanks,

C.





Re: [PATCH v5 22/34] hw/vfio: generate new file fd for pseudo device and rebind existing descriptors
Posted by Ani Sinha 1 month, 3 weeks ago
On Wed, Feb 18, 2026 at 9:00 PM Cédric Le Goater <clg@redhat.com> wrote:
>
> On 2/18/26 16:07, Ani Sinha wrote:
> >
> >
> >> On 18 Feb 2026, at 7:37 PM, Cédric Le Goater <clg@redhat.com> wrote:
> >>
> >> On 2/18/26 12:42, Ani Sinha wrote:
> >>> Normally the vfio pseudo device file descriptor lives for the life of the VM.
> >>> However, when the kvm VM file descriptor changes, a new file descriptor
> >>> for the pseudo device needs to be generated against the new kvm VM descriptor.
> >>> Other existing vfio descriptors need to be reattached to the new pseudo device
> >>> descriptor. This change performs the above steps.
> >>> Tested-by: Cédric Le Goater <clg@redhat.com>
> >>
> >> There is a regression since the last version.
> >>
> >> Both 'reboot' from the guest and the 'system_reset' command from the QEMU
> >> monitor now produce this output:
> >>
> >>   qemu-system-x86_64: info: virtual machine state has been rebuilt with new guest file handle.
> >>   qemu-system-x86_64: info: virtual machine state has been rebuilt with new guest file handle.
> >>   qemu-system-x86_64: info: virtual machine state has been rebuilt with new guest file handle.
> >>   ...
> >>
> >> and QEMU exits after a while.
> >
> > I have only seen this with SEV-ES and more than one vCPU, never with TDX or SEV-SNP (single or multiple vCPUs).
> > On which host/guest type did you see this?
>
> SEV-SNP on a RHEL9 host. It is the same guest I used before, and the host says:
>
>    [1816531.409591] kvm_amd: SEV-ES guest requested termination: 0x0:0x0

Strange! I am not sure why KVM thinks it's SEV-ES. I have done all my
SEV-SNP and TDX testing on an fc43 host, and for SEV-ES I used an fc42
host. I have not seen this kind of guest termination on SEV-SNP or TDX
on that host. I am sure there are some differences between the RHEL9
host kernel and the fc43 kernel.

Re: [PATCH v5 22/34] hw/vfio: generate new file fd for pseudo device and rebind existing descriptors
Posted by Gerd Hoffmann 1 month, 2 weeks ago
  Hi,

> > SEV-SNP on a RHEL9 host. It is the same guest I used before, and the host says:
> >
> >    [1816531.409591] kvm_amd: SEV-ES guest requested termination: 0x0:0x0
> 
> Strange! I am not sure why KVM thinks it's SEV-ES.

Historical reasons / just not checking I guess.  It's a code path which
was added with SEV-ES support and it is used with SEV-SNP too.  So it
means "SEV-ES or newer".

take care,
  Gerd
Re: [PATCH v5 22/34] hw/vfio: generate new file fd for pseudo device and rebind existing descriptors
Posted by Cédric Le Goater 1 month, 3 weeks ago
On 2/18/26 18:33, Ani Sinha wrote:
> On Wed, Feb 18, 2026 at 9:00 PM Cédric Le Goater <clg@redhat.com> wrote:
>>
>> On 2/18/26 16:07, Ani Sinha wrote:
>>>
>>>
>>>> On 18 Feb 2026, at 7:37 PM, Cédric Le Goater <clg@redhat.com> wrote:
>>>>
>>>> On 2/18/26 12:42, Ani Sinha wrote:
>>>>> Normally the vfio pseudo device file descriptor lives for the life of the VM.
>>>>> However, when the kvm VM file descriptor changes, a new file descriptor
>>>>> for the pseudo device needs to be generated against the new kvm VM descriptor.
>>>>> Other existing vfio descriptors need to be reattached to the new pseudo device
>>>>> descriptor. This change performs the above steps.
>>>>> Tested-by: Cédric Le Goater <clg@redhat.com>
>>>>
>>>> There is a regression since the last version.
>>>>
>>>> Both 'reboot' from the guest and the 'system_reset' command from the QEMU
>>>> monitor now produce this output:
>>>>
>>>>    qemu-system-x86_64: info: virtual machine state has been rebuilt with new guest file handle.
>>>>    qemu-system-x86_64: info: virtual machine state has been rebuilt with new guest file handle.
>>>>    qemu-system-x86_64: info: virtual machine state has been rebuilt with new guest file handle.
>>>>    ...
>>>>
>>>> and QEMU exits after a while.
>>>
>>> I have only seen this with SEV-ES and more than one vCPU, never with TDX or SEV-SNP (single or multiple vCPUs).
>>> On which host/guest type did you see this?
>>
>> SEV-SNP on a RHEL9 host. It is the same guest I used before, and the host says:
>>
>>     [1816531.409591] kvm_amd: SEV-ES guest requested termination: 0x0:0x0
> 
> Strange! I am not sure why KVM thinks it's SEV-ES. I have done all my
> SEV-SNP and TDX testing on an fc43 host, and for SEV-ES I used an fc42
> host. I have not seen this kind of guest termination on SEV-SNP or TDX
> on that host. I am sure there are some differences between the RHEL9
> host kernel and the fc43 kernel.

The same issue occurs with a RHEL10 kernel.

C.


>>>>> +        if (ioctl(vfio_kvm_device_fd, KVM_SET_DEVICE_ATTR, &attr)) {
>>>>> +            error_setg_errno(errp, errno,
>>>>> +                             "Failed to add fd %d to KVM VFIO device",
>>>>> +                             file_fd->fd);
>>>>> +            ret = -errno;
>>>>> +        }
>>>>> +    }
>>>>> +    return ret;
>>>>> +}
>>>>> +
>>>>> +static struct NotifierWithReturn vfio_vmfd_change_notifier = {
>>>>> +    .notify = vfio_device_fd_rebind,
>>>>> +};
>>>>> +
>>>>>    #endif
>>>>>      void vfio_kvm_device_close(void)
>>>>> @@ -153,6 +236,11 @@ int vfio_kvm_device_add_fd(int fd, Error **errp)
>>>>>            }
>>>>>              vfio_kvm_device_fd = cd.fd;
>>>>> +        /*
>>>>> +         * If the vm file descriptor changes, add a notifier so that we can
>>>>> +         * re-create the vfio_kvm_device_fd.
>>>>> +         */
>>>>> +        kvm_vmfd_add_change_notifier(&vfio_vmfd_change_notifier);
>>>>>        }
>>>>>          if (ioctl(vfio_kvm_device_fd, KVM_SET_DEVICE_ATTR, &attr)) {
>>>>> @@ -160,6 +248,8 @@ int vfio_kvm_device_add_fd(int fd, Error **errp)
>>>>>                             fd);
>>>>>            return -errno;
>>>>>        }
>>>>> +
>>>>> +    vfio_device_fd_list_add(fd);
>>>>>    #endif
>>>>>        return 0;
>>>>>    }
>>>>> @@ -183,6 +273,8 @@ int vfio_kvm_device_del_fd(int fd, Error **errp)
>>>>>                             "Failed to remove fd %d from KVM VFIO device", fd);
>>>>>            return -errno;
>>>>>        }
>>>>> +
>>>>> +    vfio_device_fd_list_remove(fd);
>>>>>    #endif
>>>>>        return 0;
>>>>>    }
>>>>
>>>
>>
> 
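
For readers following the quoted patch, the bookkeeping it relies on can be
sketched outside of QEMU. This is a minimal illustration under stated
assumptions, not the code from the series: FdRegistry, registry_add(),
registry_remove(), registry_rebind() and demo_attach() are hypothetical
names, a plain singly linked list stands in for QLIST, and a callback stands
in for the KVM_SET_DEVICE_ATTR ioctl. Like the patch, the rebind step keeps
going after a failure and reports the last error seen.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical names throughout; this is not the QEMU implementation. */
typedef struct TrackedFd {
    int fd;
    struct TrackedFd *next;
} TrackedFd;

typedef struct FdRegistry {
    TrackedFd *head;
    int parent_fd;          /* stands in for the pseudo device descriptor */
} FdRegistry;

/* Callback that attaches one fd to a parent; returns 0 or a negative errno. */
typedef int (*AttachFn)(int parent_fd, int fd);

/* Record fd so it can be re-attached later. Returns 0 or -ENOMEM. */
static int registry_add(FdRegistry *r, int fd)
{
    TrackedFd *t;

    assert(r != NULL);
    t = calloc(1, sizeof(*t));
    if (!t) {
        return -ENOMEM;
    }
    t->fd = fd;
    t->next = r->head;
    r->head = t;
    return 0;
}

/* Forget fd; a no-op if it was never recorded. */
static void registry_remove(FdRegistry *r, int fd)
{
    TrackedFd **pp;

    assert(r != NULL);
    for (pp = &r->head; *pp; pp = &(*pp)->next) {
        if ((*pp)->fd == fd) {
            TrackedFd *victim = *pp;
            *pp = victim->next;
            free(victim);
            return;
        }
    }
}

/*
 * Switch to a new parent and re-attach every recorded fd to it.
 * Keep going after a failure and report the last error seen.
 */
static int registry_rebind(FdRegistry *r, int new_parent_fd, AttachFn attach)
{
    TrackedFd *t;
    int ret = 0;

    assert(r != NULL && attach != NULL);
    r->parent_fd = new_parent_fd;
    for (t = r->head; t; t = t->next) {
        int rc = attach(r->parent_fd, t->fd);
        if (rc < 0) {
            ret = rc;
        }
    }
    return ret;
}

/* Example callback: counts attachments and rejects negative fds. */
static int attach_calls;

static int demo_attach(int parent_fd, int fd)
{
    (void)parent_fd;
    attach_calls++;
    return fd < 0 ? -EBADF : 0;
}
```

The point of keeping a separate list is that the parent descriptor can be
thrown away and recreated without the callers that registered their fds
having to take part in the rebind.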


Re: [PATCH v5 22/34] hw/vfio: generate new file fd for pseudo device and rebind existing descriptors
Posted by Ani Sinha 1 month, 3 weeks ago

> On 18 Feb 2026, at 11:09 PM, Cédric Le Goater <clg@redhat.com> wrote:
> 
> On 2/18/26 18:33, Ani Sinha wrote:
>> On Wed, Feb 18, 2026 at 9:00 PM Cédric Le Goater <clg@redhat.com> wrote:
>>> 
>>> On 2/18/26 16:07, Ani Sinha wrote:
>>>> 
>>>> 
>>>>> On 18 Feb 2026, at 7:37 PM, Cédric Le Goater <clg@redhat.com> wrote:
>>>>> 
>>>>> On 2/18/26 12:42, Ani Sinha wrote:
>>>>>> Normally the vfio pseudo device file descriptor lives for the life of the VM.
>>>>>> However, when the kvm VM file descriptor changes, a new file descriptor
>>>>>> for the pseudo device needs to be generated against the new kvm VM descriptor.
>>>>>> Other existing vfio descriptors needs to be reattached to the new pseudo device
>>>>>> descriptor. This change performs the above steps.
>>>>>> Tested-by: Cédric Le Goater <clg@redhat.com>
>>>>> 
>>>>> There is a regression since last version.
>>>>> 
>>>>> 
>>>>> 'reboot' from the guest and command 'system_reset' from the QEMU
>>>>> monitor now generate these outputs:
>>>>> 
>>>>>   qemu-system-x86_64: info: virtual machine state has been rebuilt with new guest file handle.
>>>>>   qemu-system-x86_64: info: virtual machine state has been rebuilt with new guest file handle.
>>>>>   qemu-system-x86_64: info: virtual machine state has been rebuilt with new guest file handle.
>>>>>   ...
>>>>> 
>>>>> and QEMU exits after a while.
>>>> 
>>>> I have only seen this in SEV-ES with more than one vcpus. Never with TDX or SEV-SNP (single or multiple cpus).
>>>> On which host/guest type did you see this?
>>> 
>>> SEV-SNP on a RHEL9 host. Same guest I used before and host says :
>>> 
>>>    [1816531.409591] kvm_amd: SEV-ES guest requested termination: 0x0:0x0
>> Strange! I am not sure why KVM thinks it's SEV-ES. I have done all my
>> SEV-SNP and TDX testing on a fc43 host and for SEV-ES I used a fc42
>> host. I have not seen this kind of guest termination on SEV-SNP or TDX
>> on that host. I am sure there are some differences between the RHEL9
>> host kernel and fc43 kernel.
> 
> The same issue occurs with a rhel10 kernel.

OK, the good news is that I can reproduce this issue on the latest RHEL 9 host, but not on a fc43 host.
The issue happens only when the guest is created with more than one vcpu. With a single vcpu it seems fine.
I tried a minimal guest with only a kernel passed as a UKI, as well as a full-blown fc43 guest. Both show the same issue with more than one vcpu thread.

There is indeed a regression between versions. My v3, which I posted just before FOSDEM, seems good and does not show this behaviour on the same RHEL 9 host.
Both v4 and v5 are affected. This means the regression is somewhere in my changes and is not due to host kernel changes.

This shows a little bit more information:

bash-5.1# /usr/sbin/reboot -f
[   43.920835] ACPI: PM: Preparing to enter system sleep state S5
[   43.922988] reboot: Restarting system
[   43.924242] reboot: machine restart
qemu-system-x86_64: info: virtual machine state has been rebuilt with new guest file handle.
EAX=00000000 EBX=00000000 ECX=00000000 EDX=00800f12
ESI=00000000 EDI=00000000 EBP=00000000 ESP=00000000
EIP=0000b004 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=1
ES =0000 00000000 0000ffff 00009300
CS =f000 00800000 0000ffff 00009b00
SS =0000 00000000 0000ffff 00009300
DS =0000 00000000 0000ffff 00009300
FS =0000 00000000 0000ffff 00009300
GS =0000 00000000 0000ffff 00009300
LDT=0000 00000000 0000ffff 00008200
TR =0000 00000000 0000ffff 00008b00
GDT=     00000000 0000ffff
IDT=     00000000 0000ffff
CR0=60000010 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
Code=00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <00> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

It is likely that the changes made to fix the non-coco reset case with QMP “system_reset” introduced this regression. I will have to audit the changes between v3 and v4.
Re: [PATCH v5 22/34] hw/vfio: generate new file fd for pseudo device and rebind existing descriptors
Posted by Ani Sinha 1 month, 3 weeks ago

On Thu, 19 Feb 2026, Ani Sinha wrote:

>
>
> > On 18 Feb 2026, at 11:09 PM, Cédric Le Goater <clg@redhat.com> wrote:
> >
> > On 2/18/26 18:33, Ani Sinha wrote:
> >> On Wed, Feb 18, 2026 at 9:00 PM Cédric Le Goater <clg@redhat.com> wrote:
> >>>
> >>> On 2/18/26 16:07, Ani Sinha wrote:
> >>>>
> >>>>
> >>>>> On 18 Feb 2026, at 7:37 PM, Cédric Le Goater <clg@redhat.com> wrote:
> >>>>>
> >>>>> On 2/18/26 12:42, Ani Sinha wrote:
> >>>>>> Normally the vfio pseudo device file descriptor lives for the life of the VM.
> >>>>>> However, when the kvm VM file descriptor changes, a new file descriptor
> >>>>>> for the pseudo device needs to be generated against the new kvm VM descriptor.
> >>>>>> Other existing vfio descriptors needs to be reattached to the new pseudo device
> >>>>>> descriptor. This change performs the above steps.
> >>>>>> Tested-by: Cédric Le Goater <clg@redhat.com>
> >>>>>
> >>>>> There is a regression since last version.
> >>>>>
> >>>>>
> >>>>> 'reboot' from the guest and command 'system_reset' from the QEMU
> >>>>> monitor now generate these outputs:
> >>>>>
> >>>>>   qemu-system-x86_64: info: virtual machine state has been rebuilt with new guest file handle.
> >>>>>   qemu-system-x86_64: info: virtual machine state has been rebuilt with new guest file handle.
> >>>>>   qemu-system-x86_64: info: virtual machine state has been rebuilt with new guest file handle.
> >>>>>   ...
> >>>>>
> >>>>> and QEMU exits after a while.
> >>>>
> >>>> I have only seen this in SEV-ES with more than one vcpus. Never with TDX or SEV-SNP (single or multiple cpus).
> >>>> On which host/guest type did you see this?
> >>>
> >>> SEV-SNP on a RHEL9 host. Same guest I used before and host says :
> >>>
> >>>    [1816531.409591] kvm_amd: SEV-ES guest requested termination: 0x0:0x0
> >> Strange! I am not sure why KVM thinks it's SEV-ES. I have done all my
> >> SEV-SNP and TDX testing on a fc43 host and for SEV-ES I used a fc42
> >> host. I have not seen this kind of guest termination on SEV-SNP or TDX
> >> on that host. I am sure there are some differences between the RHEL9
> >> host kernel and fc43 kernel.
> >
> > The same issue occurs with a rhel10 kernel.
>
> OK good news is that I see this issue on the latest rhel9 host but not on a fc43 host.
> The issue happens only when > 1 cpus are created for the guest. With only one cpu seems its fine.
> I tried a minimalistic guest with only a kernel passed as UKI as well as a full blown fc43 guest. Both manifest the same issue with > 1 vcpu threads.
>
> There is a regression indeed between versions. My v3 which I posted just before FOSDEM seems good and does not show this behaviour on the same RHEL 9 host.
> Both v4 and v5 are affected. This means regression is somewhere in my changes, not due to host kernel changes.
>
> This shows a little bit more information:
>
> bash-5.1# /usr/sbin/reboot -f
> [   43.920835] ACPI: PM: Preparing to enter system sleep state S5
> [   43.922988] reboot: Restarting system
> [   43.924242] reboot: machine restart
> qemu-system-x86_64: info: virtual machine state has been rebuilt with new guest file handle.
> EAX=00000000 EBX=00000000 ECX=00000000 EDX=00800f12
> ESI=00000000 EDI=00000000 EBP=00000000 ESP=00000000
> EIP=0000b004 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=1
> ES =0000 00000000 0000ffff 00009300
> CS =f000 00800000 0000ffff 00009b00
> SS =0000 00000000 0000ffff 00009300
> DS =0000 00000000 0000ffff 00009300
> FS =0000 00000000 0000ffff 00009300
> GS =0000 00000000 0000ffff 00009300
> LDT=0000 00000000 0000ffff 00008200
> TR =0000 00000000 0000ffff 00008b00
> GDT=     00000000 0000ffff
> IDT=     00000000 0000ffff
> CR0=60000010 CR2=00000000 CR3=00000000 CR4=00000000
> DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
> DR6=00000000ffff0ff0 DR7=0000000000000400
> EFER=0000000000000000
> Code=00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <00> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>
> It is likely that the changes towards fixing non-coco reset case with QMP “system_reset” introduced this regression. I will have to audit changes between v3 and v4.
>

Alright, I fixed this. Basically, I had dropped
https://gitlab.com/anisinha/qemu/-/commit/b413141cd8a5b599214153a5e37b485443885718
but it is needed. I also need to call cpu_synchronize_all_post_init()
for the coco case as well. This also sets vcpu->dirty = false, which
keeps the subsequent code paths happy.
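
As a rough illustration of the dirty-flag idea mentioned above, here is a toy
model. It is only a sketch under assumptions and not QEMU code: ToyVcpu and
the toy_* functions are hypothetical names, and the model assumes that
"dirty" means the userspace copy of the vcpu state has not yet been written
back, and that later code paths refuse to proceed while it is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model with hypothetical names; not the QEMU data structures. */
typedef struct ToyVcpu {
    bool dirty;            /* true: userspace copy is newer than the kernel's */
    unsigned long regs;    /* userspace copy of the register state */
    unsigned long kernel;  /* stands in for the state held by the kernel */
} ToyVcpu;

/* Push the userspace state to the "kernel" side and clear the dirty flag. */
static void toy_sync_post_init(ToyVcpu *cpu)
{
    assert(cpu != NULL);
    cpu->kernel = cpu->regs;
    cpu->dirty = false;
}

/* Run the post-init sync on every vcpu in the array. */
static void toy_sync_all_post_init(ToyVcpu *cpus, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        toy_sync_post_init(&cpus[i]);
    }
}

/* In this toy, later code only proceeds when no write-back is pending. */
static bool toy_can_run(const ToyVcpu *cpu)
{
    return !cpu->dirty;
}
```

In this model, skipping the sync for one class of guest leaves every vcpu
with the flag set, which is the kind of state the fix is meant to avoid.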

Please try this patch on top of my v5. It should fix the issue. I tested
on a RHEL 9 host and a fc43 SEV-SNP host, with one or multiple vcpus and two
different guests, and all looks good.

From 26db44eba8c727160dd9e97c9d5582a0ddc5884d Mon Sep 17 00:00:00 2001
From: Ani Sinha <anisinha@redhat.com>
Date: Thu, 19 Feb 2026 13:12:02 +0530
Subject: [PATCH] Call cpu_synchronize_all_post_init for coco as well

Fixes issue reported by Cedric

Signed-off-by: Ani Sinha <anisinha@redhat.com>
---
 accel/kvm/kvm-all.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index d7ea60f582..0610bf8434 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -2816,7 +2816,7 @@ static int kvm_reset_vmfd(MachineState *ms)
     }

     s->vmfd = ret;
-
+    kvm_state->guest_state_protected = false;
     kvm_setup_dirty_ring(s);

     /* rebind memory to new vm fd */
@@ -2872,6 +2872,9 @@ static int kvm_reset_vmfd(MachineState *ms)
      * kvm fd has changed. Commit the irq routes to KVM once more.
      */
     kvm_irqchip_commit_routes(s);
+    if (ms->cgs) {
+        cpu_synchronize_all_post_init();
+    }
     trace_kvm_reset_vmfd();
     return ret;
 }
-- 
2.49.0
Re: [PATCH v5 22/34] hw/vfio: generate new file fd for pseudo device and rebind existing descriptors
Posted by Cédric Le Goater 1 month, 3 weeks ago
> please try this patch on top of my v5. It should fix the issue. I tested
> on a RHEL 9 host and a fc43 SEV-SNP host, 1 or multiple vcpus, two
> different guests and all looks good.
> 
>  From 26db44eba8c727160dd9e97c9d5582a0ddc5884d Mon Sep 17 00:00:00 2001
> From: Ani Sinha <anisinha@redhat.com>
> Date: Thu, 19 Feb 2026 13:12:02 +0530
> Subject: [PATCH] Call cpu_synchronize_all_post_init for coco as well
> 
> Fixes issue reported by Cedric
> 
> Signed-off-by: Ani Sinha <anisinha@redhat.com>
> ---
>   accel/kvm/kvm-all.c | 5 ++++-
>   1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index d7ea60f582..0610bf8434 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -2816,7 +2816,7 @@ static int kvm_reset_vmfd(MachineState *ms)
>       }
> 
>       s->vmfd = ret;
> -
> +    kvm_state->guest_state_protected = false;
>       kvm_setup_dirty_ring(s);
> 
>       /* rebind memory to new vm fd */
> @@ -2872,6 +2872,9 @@ static int kvm_reset_vmfd(MachineState *ms)
>        * kvm fd has changed. Commit the irq routes to KVM once more.
>        */
>       kvm_irqchip_commit_routes(s);
> +    if (ms->cgs) {
> +        cpu_synchronize_all_post_init();
> +    }
>       trace_kvm_reset_vmfd();
>       return ret;
>   }

Tested (OS reboot, system_reset) with a RHEL10 guest (2 vCPUs) on
a RHEL9 and a RHEL10 host, using a SATA device and active NIC VFs.
All looks good.

There is still one message :

   qemu-system-x86_64: info: virtual machine state has been rebuilt with new guest file handle.

which seems normal.

Thanks,

C.
Re: [PATCH v5 22/34] hw/vfio: generate new file fd for pseudo device and rebind existing descriptors
Posted by Ani Sinha 1 month, 3 weeks ago

> On 19 Feb 2026, at 1:44 PM, Cédric Le Goater <clg@redhat.com> wrote:
> 
>> please try this patch on top of my v5. It should fix the issue. I tested
>> on a RHEL 9 host and a fc43 SEV-SNP host, 1 or multiple vcpus, two
>> different guests and all looks good.
>> From 26db44eba8c727160dd9e97c9d5582a0ddc5884d Mon Sep 17 00:00:00 2001
>> From: Ani Sinha <anisinha@redhat.com>
>> Date: Thu, 19 Feb 2026 13:12:02 +0530
>> Subject: [PATCH] Call cpu_synchronize_all_post_init for coco as well
>> Fixes issue reported by Cedric
>> Signed-off-by: Ani Sinha <anisinha@redhat.com>
>> ---
>>  accel/kvm/kvm-all.c | 5 ++++-
>>  1 file changed, 4 insertions(+), 1 deletion(-)
>> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
>> index d7ea60f582..0610bf8434 100644
>> --- a/accel/kvm/kvm-all.c
>> +++ b/accel/kvm/kvm-all.c
>> @@ -2816,7 +2816,7 @@ static int kvm_reset_vmfd(MachineState *ms)
>>      }
>>      s->vmfd = ret;
>> -
>> +    kvm_state->guest_state_protected = false;
>>      kvm_setup_dirty_ring(s);
>>      /* rebind memory to new vm fd */
>> @@ -2872,6 +2872,9 @@ static int kvm_reset_vmfd(MachineState *ms)
>>       * kvm fd has changed. Commit the irq routes to KVM once more.
>>       */
>>      kvm_irqchip_commit_routes(s);
>> +    if (ms->cgs) {
>> +        cpu_synchronize_all_post_init();
>> +    }
>>      trace_kvm_reset_vmfd();
>>      return ret;
>>  }
> 
> Tested (OS reboot, system_reset) with a RHEL10 guest (2 vCPUs) on
> a RHEL9 and a RHEL10 host, using a SATA device and active NIC VFs.
> All Looks good.

Excellent. I also tested on a TDX host with RHEL 9.6 and it seems good as well. I also ran my integration tests on non-coco and all is fine there too.


> 
> There is still one message :
> 
>  qemu-system-x86_64: info: virtual machine state has been rebuilt with new guest file handle.
> 
> which seems normal.
> 
> Thanks,
> 
> C.
> 
> 
> 
Re: [PATCH v5 22/34] hw/vfio: generate new file fd for pseudo device and rebind existing descriptors
Posted by Ani Sinha 1 month, 3 weeks ago
On Wed, Feb 18, 2026 at 9:00 PM Cédric Le Goater <clg@redhat.com> wrote:
>
> On 2/18/26 16:07, Ani Sinha wrote:
> >
> >
> >> On 18 Feb 2026, at 7:37 PM, Cédric Le Goater <clg@redhat.com> wrote:
> >>
> >> On 2/18/26 12:42, Ani Sinha wrote:
> >>> Normally the vfio pseudo device file descriptor lives for the life of the VM.
> >>> However, when the kvm VM file descriptor changes, a new file descriptor
> >>> for the pseudo device needs to be generated against the new kvm VM descriptor.
> >>> Other existing vfio descriptors needs to be reattached to the new pseudo device
> >>> descriptor. This change performs the above steps.
> >>> Tested-by: Cédric Le Goater <clg@redhat.com>
> >>
> >> There is a regression since last version.
> >>
> >>
> >> 'reboot' from the guest and command 'system_reset' from the QEMU
> >> monitor now generate these outputs:
> >>
> >>   qemu-system-x86_64: info: virtual machine state has been rebuilt with new guest file handle.
> >>   qemu-system-x86_64: info: virtual machine state has been rebuilt with new guest file handle.
> >>   qemu-system-x86_64: info: virtual machine state has been rebuilt with new guest file handle.
> >>   ...
> >>
> >> and QEMU exits after a while.
> >
> > I have only seen this in SEV-ES with more than one vcpus. Never with TDX or SEV-SNP (single or multiple cpus).
> > On which host/guest type did you see this?
>
> SEV-SNP on a RHEL9 host. Same guest I used before and host says :
>
>    [1816531.409591] kvm_amd: SEV-ES guest requested termination: 0x0:0x0

OK, so the guest is SEV-ES. Most likely you are also using more than one vcpu.
Try with one vcpu and/or enabling SEV-SNP.
Re: [PATCH v5 22/34] hw/vfio: generate new file fd for pseudo device and rebind existing descriptors
Posted by Cédric Le Goater 1 month, 3 weeks ago
On 2/18/26 17:06, Ani Sinha wrote:
> On Wed, Feb 18, 2026 at 9:00 PM Cédric Le Goater <clg@redhat.com> wrote:
>>
>> On 2/18/26 16:07, Ani Sinha wrote:
>>>
>>>
>>>> On 18 Feb 2026, at 7:37 PM, Cédric Le Goater <clg@redhat.com> wrote:
>>>>
>>>> On 2/18/26 12:42, Ani Sinha wrote:
>>>>> Normally the vfio pseudo device file descriptor lives for the life of the VM.
>>>>> However, when the kvm VM file descriptor changes, a new file descriptor
>>>>> for the pseudo device needs to be generated against the new kvm VM descriptor.
>>>>> Other existing vfio descriptors needs to be reattached to the new pseudo device
>>>>> descriptor. This change performs the above steps.
>>>>> Tested-by: Cédric Le Goater <clg@redhat.com>
>>>>
>>>> There is a regression since last version.
>>>>
>>>>
>>>> 'reboot' from the guest and command 'system_reset' from the QEMU
>>>> monitor now generate these outputs:
>>>>
>>>>    qemu-system-x86_64: info: virtual machine state has been rebuilt with new guest file handle.
>>>>    qemu-system-x86_64: info: virtual machine state has been rebuilt with new guest file handle.
>>>>    qemu-system-x86_64: info: virtual machine state has been rebuilt with new guest file handle.
>>>>    ...
>>>>
>>>> and QEMU exits after a while.
>>>
>>> I have only seen this in SEV-ES with more than one vcpus. Never with TDX or SEV-SNP (single or multiple cpus).
>>> On which host/guest type did you see this?
>>
>> SEV-SNP on a RHEL9 host. Same guest I used before and host says :
>>
>>     [1816531.409591] kvm_amd: SEV-ES guest requested termination: 0x0:0x0
> 
> Ok so the guest is SEV-ES. most likely you are also using > 1 vcpu.
> Try with one vcpu and/or enabling SEV-SNP.
> 

The guest is SEV-SNP and has 2 cpus.

[root@vm12 ~]# dmesg  | grep SEV
[    0.544608] Memory Encryption Features active: AMD SEV SEV-ES SEV-SNP
[    0.545580] SEV: Status: SEV SEV-ES SEV-SNP
[    0.663592] SEV: APIC: wakeup_secondary_cpu() replaced with wakeup_cpu_via_vmgexit()
[    0.719630] SEV: Using SNP CPUID table, 32 entries present.
[    0.719636] SEV: SNP running at VMPL0.
[    1.043748] SEV: SNP guest platform devices initialized.
[    4.530555] sev-guest sev-guest: Initialized SEV guest driver (using vmpck_id 0)
[root@vm12 ~]# lscpu
Architecture:                x86_64
   CPU op-mode(s):            32-bit, 64-bit
   Address sizes:             40 bits physical, 48 bits virtual
   Byte Order:                Little Endian
CPU(s):                      2
   On-line CPU(s) list:       0,1
Vendor ID:                   AuthenticAMD
   BIOS Vendor ID:            QEMU
   Model name:                AMD EPYC-v4 Processor
     BIOS Model name:         pc-q35-11.0
     ....