[PATCH v1] vmstate: fix subsection load name check

Alexandr Moshkov posted 1 patch 1 month, 2 weeks ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20260220102628.459681-1-dtalexundeer@yandex-team.ru
Maintainers: Peter Xu <peterx@redhat.com>, Fabiano Rosas <farosas@suse.de>
There is a newer version of this series
migration/vmstate.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
[PATCH v1] vmstate: fix subsection load name check
Posted by Alexandr Moshkov 1 month, 2 weeks ago
When loading a subset, its name is checked for the parent prefix. The
following bug may occur here:

Let's say there is a vmstate named "virtio-blk", it has a subsection
named "virtio-blk/subsection", and it also has another vmstate named
"virtio" in the fields.
Then, during the migration, when trying to load this subsection for
"virtio", the prefix condition will pass for "virtio-blk/subsection" and
then the migration will break, because this vmstate does not have such a
subsection.

In other words, if a field inside vmstate1 is set via vmstate2 with a
name that is a prefix of the parent vmstate, then the field can "steal"
a subsection belonging to the parent state.

Fix it by checking `/` at the end of idstr.

Signed-off-by: Alexandr Moshkov <dtalexundeer@yandex-team.ru>
---
 migration/vmstate.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/migration/vmstate.c b/migration/vmstate.c
index 4d28364f7b..e9901ee349 100644
--- a/migration/vmstate.c
+++ b/migration/vmstate.c
@@ -613,7 +613,7 @@ static int vmstate_subsection_load(QEMUFile *f, const VMStateDescription *vmsd,
 
     while (qemu_peek_byte(f, 0) == QEMU_VM_SUBSECTION) {
         char idstr[256], *idstr_ret;
-        int ret;
+        int ret, vmsd_name_len;
         uint8_t version_id, len, size;
         const VMStateDescription *sub_vmsd;
 
@@ -631,7 +631,9 @@ static int vmstate_subsection_load(QEMUFile *f, const VMStateDescription *vmsd,
         memcpy(idstr, idstr_ret, size);
         idstr[size] = 0;
 
-        if (strncmp(vmsd->name, idstr, strlen(vmsd->name)) != 0) {
+        vmsd_name_len = strlen(vmsd->name);
+        if (strncmp(vmsd->name, idstr, vmsd_name_len) != 0 ||
+            idstr[vmsd_name_len] != '/') {
             trace_vmstate_subsection_load_bad(vmsd->name, idstr, "(prefix)");
             /* it doesn't have a valid subsection name */
             return 0;
-- 
2.34.1
Re: [PATCH v1] vmstate: fix subsection load name check
Posted by Peter Xu 1 month, 2 weeks ago
On Fri, Feb 20, 2026 at 03:26:29PM +0500, Alexandr Moshkov wrote:
> When loading a subset, its name is checked for the parent prefix. The
> following bug may occur here:
> 
> Let's say there is a vmstate named "virtio-blk", it has a subsection
> named "virtio-blk/subsection", and it also has another vmstate named
> "virtio" in the fields.
> Then, during the migration, when trying to load this subsection for
> "virtio", the prefix condition will pass for "virtio-blk/subsection" and
> then the migration will break, because this vmstate does not have such a
> subsection.
> 
> In other words, if a field inside vmstate1 is set via vmstate2 with a
> name that is a prefix of the parent vmstate, then the field can "steal"
> a subsection belonging to the parent state.
> 
> Fix it by checking `/` at the end of idstr.

Checking versus '\' looks reasonable, however I'm still confused on the
example given, and what problem you hit.

Here, your concern seems to be that vmstate_subsection_load() can
accidentally load a FIELD of the parent VMSD whose name is exactly the name
of the parent VMSD (which will be the prefix of all subsections).

Now my question is, when reaching the line you modified below, it needs to
be prefixed with QEMU_VM_SUBSECTION.  It means the src QEMU is dumping a
subsection rather than a field.  OTOH, when dumping a field, we never dump
any name; I don't think we name FIELD at all..

Could you share the failure you hit in real life?  That might help to
understand the problem on its own.

Thanks,

> 
> Signed-off-by: Alexandr Moshkov <dtalexundeer@yandex-team.ru>
> ---
>  migration/vmstate.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/migration/vmstate.c b/migration/vmstate.c
> index 4d28364f7b..e9901ee349 100644
> --- a/migration/vmstate.c
> +++ b/migration/vmstate.c
> @@ -613,7 +613,7 @@ static int vmstate_subsection_load(QEMUFile *f, const VMStateDescription *vmsd,
>  
>      while (qemu_peek_byte(f, 0) == QEMU_VM_SUBSECTION) {
>          char idstr[256], *idstr_ret;
> -        int ret;
> +        int ret, vmsd_name_len;
>          uint8_t version_id, len, size;
>          const VMStateDescription *sub_vmsd;
>  
> @@ -631,7 +631,9 @@ static int vmstate_subsection_load(QEMUFile *f, const VMStateDescription *vmsd,
>          memcpy(idstr, idstr_ret, size);
>          idstr[size] = 0;
>  
> -        if (strncmp(vmsd->name, idstr, strlen(vmsd->name)) != 0) {
> +        vmsd_name_len = strlen(vmsd->name);
> +        if (strncmp(vmsd->name, idstr, vmsd_name_len) != 0 ||
> +            idstr[vmsd_name_len] != '/') {
>              trace_vmstate_subsection_load_bad(vmsd->name, idstr, "(prefix)");
>              /* it doesn't have a valid subsection name */
>              return 0;
> -- 
> 2.34.1
> 

-- 
Peter Xu
Re: [PATCH v1] vmstate: fix subsection load name check
Posted by Alexandr Moshkov 1 month, 2 weeks ago
On 2/23/26 23:58, Peter Xu wrote:
> On Fri, Feb 20, 2026 at 03:26:29PM +0500, Alexandr Moshkov wrote:
>> When loading a subset, its name is checked for the parent prefix. The
>> following bug may occur here:
>>
>> Let's say there is a vmstate named "virtio-blk", it has a subsection
>> named "virtio-blk/subsection", and it also has another vmstate named
>> "virtio" in the fields.
>> Then, during the migration, when trying to load this subsection for
>> "virtio", the prefix condition will pass for "virtio-blk/subsection" and
>> then the migration will break, because this vmstate does not have such a
>> subsection.
>>
>> In other words, if a field inside vmstate1 is set via vmstate2 with a
>> name that is a prefix of the parent vmstate, then the field can "steal"
>> a subsection belonging to the parent state.
>>
>> Fix it by checking `/` at the end of idstr.
> Checking versus '\' looks reasonable, however I'm still confused on the
> example given, and what problem you hit.
>
> Here, your concern seems to be that vmstate_subsection_load() can
> accidentally load a FIELD of the parent VMSD whose name is exactly the name
> of the parent VMSD (which will be the prefix of all subsections).

Thanks for reply,

vmstate_subsection_load() while trying to load a FIELD (whose name is 
prefix of parent name) of the parent VMSD will try to load parent 
subsection.

>
> Now my question is, when reaching the line you modified below, it needs to
> be prefixed with QEMU_VM_SUBSECTION.  It means the src QEMU is dumping a
> subsection rather than a field.  OTOH, when dumping a field, we never dump
> any name; I don't think we name FIELD at all..
>
> Could you share the failure you hit in real life?  That might help to
> understand the problem on its own.

Here is code example:

```

static const VMStateDescription vmstate_vhost_user_virtio_blk_inflight = {
.name = "virtio-blk/inflight",
.version_id = 2,
.needed = vhost_user_blk_inflight_needed,
.fields = (const VMStateField[]) {
VMSTATE_VHOST_INFLIGHT_REGION(inflight, VHostUserBlk),
VMSTATE_END_OF_LIST()
}
};


static const VMStateDescription vmstate_vhost_user_virtio_blk = {
.name = "virtio-blk",
.minimum_version_id = 2,
.version_id = 2,
.fields = (VMStateField[]) {
VMSTATE_VIRTIO_DEVICE,
VMSTATE_END_OF_LIST()
},
.subsections = (const VMStateDescription * []) {
&vmstate_vhost_user_virtio_blk_inflight,
NULL
}
};

```

VMSTATE_VIRTIO_DEVICE is

```

#define VMSTATE_VIRTIO_DEVICE \
{ \
.name = "virtio", \
.info = &virtio_vmstate_info, \
.flags = VMS_SINGLE, \
}
```
So here is "virtio-blk" vmsd that have "virtio" vmsd field and 
"virtio-blk/inflight" subsection. This configuration will result in the 
fact that inflight vmsd will not be loaded at all (assuming that it met 
all the requirements). Qemu logs will contain 
trace_vmstate_subsection_load_bad (lookup) error for "virtio" vmstate 
when loading "virtio-blk/inflight" subsection.


Thanks

> Signed-off-by: Alexandr Moshkov<dtalexundeer@yandex-team.ru>
> ---
>   migration/vmstate.c | 6 ++++--
>   1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/migration/vmstate.c b/migration/vmstate.c
> index 4d28364f7b..e9901ee349 100644
> --- a/migration/vmstate.c
> +++ b/migration/vmstate.c
> @@ -613,7 +613,7 @@ static int vmstate_subsection_load(QEMUFile *f, const VMStateDescription *vmsd,
>   
>       while (qemu_peek_byte(f, 0) == QEMU_VM_SUBSECTION) {
>           char idstr[256], *idstr_ret;
> -        int ret;
> +        int ret, vmsd_name_len;
>           uint8_t version_id, len, size;
>           const VMStateDescription *sub_vmsd;
>   
> @@ -631,7 +631,9 @@ static int vmstate_subsection_load(QEMUFile *f, const VMStateDescription *vmsd,
>           memcpy(idstr, idstr_ret, size);
>           idstr[size] = 0;
>   
> -        if (strncmp(vmsd->name, idstr, strlen(vmsd->name)) != 0) {
> +        vmsd_name_len = strlen(vmsd->name);
> +        if (strncmp(vmsd->name, idstr, vmsd_name_len) != 0 ||
> +            idstr[vmsd_name_len] != '/') {
>               trace_vmstate_subsection_load_bad(vmsd->name, idstr, "(prefix)");
>               /* it doesn't have a valid subsection name */
>               return 0;
> -- 
> 2.34.1
>
>
Re: [PATCH v1] vmstate: fix subsection load name check
Posted by Peter Xu 1 month, 2 weeks ago
On Tue, Feb 24, 2026 at 12:10:20PM +0500, Alexandr Moshkov wrote:
> 
> On 2/23/26 23:58, Peter Xu wrote:
> > On Fri, Feb 20, 2026 at 03:26:29PM +0500, Alexandr Moshkov wrote:
> > > When loading a subset, its name is checked for the parent prefix. The
> > > following bug may occur here:
> > > 
> > > Let's say there is a vmstate named "virtio-blk", it has a subsection
> > > named "virtio-blk/subsection", and it also has another vmstate named
> > > "virtio" in the fields.
> > > Then, during the migration, when trying to load this subsection for
> > > "virtio", the prefix condition will pass for "virtio-blk/subsection" and
> > > then the migration will break, because this vmstate does not have such a
> > > subsection.
> > > 
> > > In other words, if a field inside vmstate1 is set via vmstate2 with a
> > > name that is a prefix of the parent vmstate, then the field can "steal"
> > > a subsection belonging to the parent state.
> > > 
> > > Fix it by checking `/` at the end of idstr.
> > Checking versus '\' looks reasonable, however I'm still confused on the
> > example given, and what problem you hit.
> > 
> > Here, your concern seems to be that vmstate_subsection_load() can
> > accidentally load a FIELD of the parent VMSD whose name is exactly the name
> > of the parent VMSD (which will be the prefix of all subsections).
> 
> Thanks for reply,
> 
> vmstate_subsection_load() while trying to load a FIELD (whose name is prefix
> of parent name) of the parent VMSD will try to load parent subsection.
> 
> > 
> > Now my question is, when reaching the line you modified below, it needs to
> > be prefixed with QEMU_VM_SUBSECTION.  It means the src QEMU is dumping a
> > subsection rather than a field.  OTOH, when dumping a field, we never dump
> > any name; I don't think we name FIELD at all..
> > 
> > Could you share the failure you hit in real life?  That might help to
> > understand the problem on its own.
> 
> Here is code example:
> 
> ```
> 
> static const VMStateDescription vmstate_vhost_user_virtio_blk_inflight = {
> .name = "virtio-blk/inflight",
> .version_id = 2,
> .needed = vhost_user_blk_inflight_needed,
> .fields = (const VMStateField[]) {
> VMSTATE_VHOST_INFLIGHT_REGION(inflight, VHostUserBlk),
> VMSTATE_END_OF_LIST()
> }
> };
> 
> 
> static const VMStateDescription vmstate_vhost_user_virtio_blk = {
> .name = "virtio-blk",
> .minimum_version_id = 2,
> .version_id = 2,
> .fields = (VMStateField[]) {
> VMSTATE_VIRTIO_DEVICE,
> VMSTATE_END_OF_LIST()
> },
> .subsections = (const VMStateDescription * []) {
> &vmstate_vhost_user_virtio_blk_inflight,
> NULL
> }
> };
> 
> ```
> 
> VMSTATE_VIRTIO_DEVICE is
> 
> ```
> 
> #define VMSTATE_VIRTIO_DEVICE \
> { \
> .name = "virtio", \
> .info = &virtio_vmstate_info, \
> .flags = VMS_SINGLE, \
> }
> ```
> So here is "virtio-blk" vmsd that have "virtio" vmsd field and
> "virtio-blk/inflight" subsection. This configuration will result in the fact
> that inflight vmsd will not be loaded at all (assuming that it met all the
> requirements). Qemu logs will contain trace_vmstate_subsection_load_bad
> (lookup) error for "virtio" vmstate when loading "virtio-blk/inflight"
> subsection.

Thanks for the details.  However it didn't resolve my confusion.  Let me
ask more explicitly.

In this case, VMSTATE_VIRTIO_DEVICE will be a single field in the parent
vmsd of "virtio-blk".  Meanwhile, "virtio-blk/inflight" will be the only
subsection.

Now, if you hit a LOOKUP error of trace_vmstate_subsection_load_bad(), it
means the dst QEMU hits this:

vmstate_subsection_load():
        sub_vmsd = vmstate_get_subsection(vmsd->subsections, idstr);
        if (sub_vmsd == NULL) {
            trace_vmstate_subsection_load_bad(vmsd->name, idstr, "(lookup)");
            error_setg(errp, "VM subsection '%s' in '%s' does not exist",
                       idstr, vmsd->name);
            return -ENOENT;
        }

IIUC, when reaching here, the load of VMSTATE_VIRTIO_DEVICE should have
been completed, because we always load fields before subsections.

Meanwhile, when reaching here, "idstr" should be "virtio-blk/inflight",
because that's essentailly the only subsection this parent VMSD has.

vmstate_get_subsection() should try to lookup all subsections matching it,
and it should find it.

I do not yet get why the VMSTATE_VIRTIO_DEVICE can get involved in the
lookup (e.g. it is not on vmsd->subsections), and why it caused failure.

Could you help me to find where I missed?

Thanks,

-- 
Peter Xu


Re: [PATCH v1] vmstate: fix subsection load name check
Posted by Alexandr Moshkov 1 month, 2 weeks ago
On 2/24/26 21:13, Peter Xu wrote:
> On Tue, Feb 24, 2026 at 12:10:20PM +0500, Alexandr Moshkov wrote:
>> On 2/23/26 23:58, Peter Xu wrote:
>>> On Fri, Feb 20, 2026 at 03:26:29PM +0500, Alexandr Moshkov wrote:
>>>> When loading a subset, its name is checked for the parent prefix. The
>>>> following bug may occur here:
>>>>
>>>> Let's say there is a vmstate named "virtio-blk", it has a subsection
>>>> named "virtio-blk/subsection", and it also has another vmstate named
>>>> "virtio" in the fields.
>>>> Then, during the migration, when trying to load this subsection for
>>>> "virtio", the prefix condition will pass for "virtio-blk/subsection" and
>>>> then the migration will break, because this vmstate does not have such a
>>>> subsection.
>>>>
>>>> In other words, if a field inside vmstate1 is set via vmstate2 with a
>>>> name that is a prefix of the parent vmstate, then the field can "steal"
>>>> a subsection belonging to the parent state.
>>>>
>>>> Fix it by checking `/` at the end of idstr.
>>> Checking versus '\' looks reasonable, however I'm still confused on the
>>> example given, and what problem you hit.
>>>
>>> Here, your concern seems to be that vmstate_subsection_load() can
>>> accidentally load a FIELD of the parent VMSD whose name is exactly the name
>>> of the parent VMSD (which will be the prefix of all subsections).
>> Thanks for reply,
>>
>> vmstate_subsection_load() while trying to load a FIELD (whose name is prefix
>> of parent name) of the parent VMSD will try to load parent subsection.
>>
>>> Now my question is, when reaching the line you modified below, it needs to
>>> be prefixed with QEMU_VM_SUBSECTION.  It means the src QEMU is dumping a
>>> subsection rather than a field.  OTOH, when dumping a field, we never dump
>>> any name; I don't think we name FIELD at all..
>>>
>>> Could you share the failure you hit in real life?  That might help to
>>> understand the problem on its own.
>> Here is code example:
>>
>> ```
>>
>> static const VMStateDescription vmstate_vhost_user_virtio_blk_inflight = {
>> .name = "virtio-blk/inflight",
>> .version_id = 2,
>> .needed = vhost_user_blk_inflight_needed,
>> .fields = (const VMStateField[]) {
>> VMSTATE_VHOST_INFLIGHT_REGION(inflight, VHostUserBlk),
>> VMSTATE_END_OF_LIST()
>> }
>> };
>>
>>
>> static const VMStateDescription vmstate_vhost_user_virtio_blk = {
>> .name = "virtio-blk",
>> .minimum_version_id = 2,
>> .version_id = 2,
>> .fields = (VMStateField[]) {
>> VMSTATE_VIRTIO_DEVICE,
>> VMSTATE_END_OF_LIST()
>> },
>> .subsections = (const VMStateDescription * []) {
>> &vmstate_vhost_user_virtio_blk_inflight,
>> NULL
>> }
>> };
>>
>> ```
>>
>> VMSTATE_VIRTIO_DEVICE is
>>
>> ```
>>
>> #define VMSTATE_VIRTIO_DEVICE \
>> { \
>> .name = "virtio", \
>> .info = &virtio_vmstate_info, \
>> .flags = VMS_SINGLE, \
>> }
>> ```
>> So here is "virtio-blk" vmsd that have "virtio" vmsd field and
>> "virtio-blk/inflight" subsection. This configuration will result in the fact
>> that inflight vmsd will not be loaded at all (assuming that it met all the
>> requirements). Qemu logs will contain trace_vmstate_subsection_load_bad
>> (lookup) error for "virtio" vmstate when loading "virtio-blk/inflight"
>> subsection.
> Thanks for the details.  However it didn't resolve my confusion.  Let me
> ask more explicitly.
>
> In this case, VMSTATE_VIRTIO_DEVICE will be a single field in the parent
> vmsd of "virtio-blk".  Meanwhile, "virtio-blk/inflight" will be the only
> subsection.
>
> Now, if you hit a LOOKUP error of trace_vmstate_subsection_load_bad(), it
> means the dst QEMU hits this:
>
> vmstate_subsection_load():
>          sub_vmsd = vmstate_get_subsection(vmsd->subsections, idstr);
>          if (sub_vmsd == NULL) {
>              trace_vmstate_subsection_load_bad(vmsd->name, idstr, "(lookup)");
>              error_setg(errp, "VM subsection '%s' in '%s' does not exist",
>                         idstr, vmsd->name);
>              return -ENOENT;
>          }

Yes, and it will return -ENOENT. This causes an error in upper function. 
Error in destination qemu logs will looks like this:

[... load virtio field ...]
vmstate_subsection_load_bad virtio: virtio-blk/inflight/(lookup)
qemu-system-x86_64: Failed to load virtio-blk:virtio
vmstate_load_field_error field "virtio" load failed, ret = -2
qemu-system-x86_64: error while loading state for instance 0x0 of device 
'0000:80:03.0:00.0:00.0/virtio-blk'
qemu-system-x86_64: load of migration failed: No such file or directory

> IIUC, when reaching here, the load of VMSTATE_VIRTIO_DEVICE should have
> been completed, because we always load fields before subsections.
>
> Meanwhile, when reaching here, "idstr" should be "virtio-blk/inflight",
> because that's essentailly the only subsection this parent VMSD has.
>
> vmstate_get_subsection() should try to lookup all subsections matching it,
> and it should find it.
>
> I do not yet get why the VMSTATE_VIRTIO_DEVICE can get involved in the
> lookup (e.g. it is not on vmsd->subsections), and why it caused failure.
>
> Could you help me to find where I missed?
>
> Thanks,
>

Re: [PATCH v1] vmstate: fix subsection load name check
Posted by Peter Xu 1 month, 2 weeks ago
On Wed, Feb 25, 2026 at 11:04:40AM +0500, Alexandr Moshkov wrote:
> 
> On 2/24/26 21:13, Peter Xu wrote:
> > On Tue, Feb 24, 2026 at 12:10:20PM +0500, Alexandr Moshkov wrote:
> > > On 2/23/26 23:58, Peter Xu wrote:
> > > > On Fri, Feb 20, 2026 at 03:26:29PM +0500, Alexandr Moshkov wrote:
> > > > > When loading a subset, its name is checked for the parent prefix. The
> > > > > following bug may occur here:
> > > > > 
> > > > > Let's say there is a vmstate named "virtio-blk", it has a subsection
> > > > > named "virtio-blk/subsection", and it also has another vmstate named
> > > > > "virtio" in the fields.
> > > > > Then, during the migration, when trying to load this subsection for
> > > > > "virtio", the prefix condition will pass for "virtio-blk/subsection" and
> > > > > then the migration will break, because this vmstate does not have such a
> > > > > subsection.
> > > > > 
> > > > > In other words, if a field inside vmstate1 is set via vmstate2 with a
> > > > > name that is a prefix of the parent vmstate, then the field can "steal"
> > > > > a subsection belonging to the parent state.
> > > > > 
> > > > > Fix it by checking `/` at the end of idstr.
> > > > Checking versus '\' looks reasonable, however I'm still confused on the
> > > > example given, and what problem you hit.
> > > > 
> > > > Here, your concern seems to be that vmstate_subsection_load() can
> > > > accidentally load a FIELD of the parent VMSD whose name is exactly the name
> > > > of the parent VMSD (which will be the prefix of all subsections).
> > > Thanks for reply,
> > > 
> > > vmstate_subsection_load() while trying to load a FIELD (whose name is prefix
> > > of parent name) of the parent VMSD will try to load parent subsection.
> > > 
> > > > Now my question is, when reaching the line you modified below, it needs to
> > > > be prefixed with QEMU_VM_SUBSECTION.  It means the src QEMU is dumping a
> > > > subsection rather than a field.  OTOH, when dumping a field, we never dump
> > > > any name; I don't think we name FIELD at all..
> > > > 
> > > > Could you share the failure you hit in real life?  That might help to
> > > > understand the problem on its own.
> > > Here is code example:
> > > 
> > > ```
> > > 
> > > static const VMStateDescription vmstate_vhost_user_virtio_blk_inflight = {
> > > .name = "virtio-blk/inflight",
> > > .version_id = 2,
> > > .needed = vhost_user_blk_inflight_needed,
> > > .fields = (const VMStateField[]) {
> > > VMSTATE_VHOST_INFLIGHT_REGION(inflight, VHostUserBlk),
> > > VMSTATE_END_OF_LIST()
> > > }
> > > };
> > > 
> > > 
> > > static const VMStateDescription vmstate_vhost_user_virtio_blk = {
> > > .name = "virtio-blk",
> > > .minimum_version_id = 2,
> > > .version_id = 2,
> > > .fields = (VMStateField[]) {
> > > VMSTATE_VIRTIO_DEVICE,
> > > VMSTATE_END_OF_LIST()
> > > },
> > > .subsections = (const VMStateDescription * []) {
> > > &vmstate_vhost_user_virtio_blk_inflight,
> > > NULL
> > > }
> > > };
> > > 
> > > ```
> > > 
> > > VMSTATE_VIRTIO_DEVICE is
> > > 
> > > ```
> > > 
> > > #define VMSTATE_VIRTIO_DEVICE \
> > > { \
> > > .name = "virtio", \
> > > .info = &virtio_vmstate_info, \
> > > .flags = VMS_SINGLE, \
> > > }
> > > ```
> > > So here is "virtio-blk" vmsd that have "virtio" vmsd field and
> > > "virtio-blk/inflight" subsection. This configuration will result in the fact
> > > that inflight vmsd will not be loaded at all (assuming that it met all the
> > > requirements). Qemu logs will contain trace_vmstate_subsection_load_bad
> > > (lookup) error for "virtio" vmstate when loading "virtio-blk/inflight"
> > > subsection.
> > Thanks for the details.  However it didn't resolve my confusion.  Let me
> > ask more explicitly.
> > 
> > In this case, VMSTATE_VIRTIO_DEVICE will be a single field in the parent
> > vmsd of "virtio-blk".  Meanwhile, "virtio-blk/inflight" will be the only
> > subsection.
> > 
> > Now, if you hit a LOOKUP error of trace_vmstate_subsection_load_bad(), it
> > means the dst QEMU hits this:
> > 
> > vmstate_subsection_load():
> >          sub_vmsd = vmstate_get_subsection(vmsd->subsections, idstr);
> >          if (sub_vmsd == NULL) {
> >              trace_vmstate_subsection_load_bad(vmsd->name, idstr, "(lookup)");
> >              error_setg(errp, "VM subsection '%s' in '%s' does not exist",
> >                         idstr, vmsd->name);
> >              return -ENOENT;
> >          }
> 
> Yes, and it will return -ENOENT. This causes an error in upper function.
> Error in destination qemu logs will looks like this:
> 
> [... load virtio field ...]
> vmstate_subsection_load_bad virtio: virtio-blk/inflight/(lookup)

Yes, this implies vmstate_get_subsection() failure.

> qemu-system-x86_64: Failed to load virtio-blk:virtio

This implies the above failure happened in a nested load of
"virtio-blk:virtio", rather than the top "virtio-blk".

Here virtio_load invokes two vmstate_load_state on vdc->vmsd,
vmstate_virtio.

If the loader side was trying to lookup the "/inflight" subsection within
e.g. vmstate_virtio then it will fail indeed, but I don't understand why it
sees the "/inflight" subsection.  Can you help explain?

Are you using different versions of QEMU on src/dst when testing?  If
they're different (or with different patches applied), please spell them
out.

Or if you could help me to answer what is missing that I stated below it
might also help me to figure out what I missed.  So far I'm still a bit
lost and not yet understand why this patch helps..

Thanks,

> vmstate_load_field_error field "virtio" load failed, ret = -2
> qemu-system-x86_64: error while loading state for instance 0x0 of device
> '0000:80:03.0:00.0:00.0/virtio-blk'
> qemu-system-x86_64: load of migration failed: No such file or directory
> 
> > IIUC, when reaching here, the load of VMSTATE_VIRTIO_DEVICE should have
> > been completed, because we always load fields before subsections.
> > 
> > Meanwhile, when reaching here, "idstr" should be "virtio-blk/inflight",
> > because that's essentailly the only subsection this parent VMSD has.
> > 
> > vmstate_get_subsection() should try to lookup all subsections matching it,
> > and it should find it.
> > 
> > I do not yet get why the VMSTATE_VIRTIO_DEVICE can get involved in the
> > lookup (e.g. it is not on vmsd->subsections), and why it caused failure.
> > 
> > Could you help me to find where I missed?
> > 
> > Thanks,
> > 
> 

-- 
Peter Xu


Re: [PATCH v1] vmstate: fix subsection load name check
Posted by Alexandr Moshkov 1 month, 2 weeks ago
On 2/25/26 22:23, Peter Xu wrote:
> On Wed, Feb 25, 2026 at 11:04:40AM +0500, Alexandr Moshkov wrote:
>> On 2/24/26 21:13, Peter Xu wrote:
>>> On Tue, Feb 24, 2026 at 12:10:20PM +0500, Alexandr Moshkov wrote:
>>>> On 2/23/26 23:58, Peter Xu wrote:
>>>>> On Fri, Feb 20, 2026 at 03:26:29PM +0500, Alexandr Moshkov wrote:
>>>>>> When loading a subset, its name is checked for the parent prefix. The
>>>>>> following bug may occur here:
>>>>>>
>>>>>> Let's say there is a vmstate named "virtio-blk", it has a subsection
>>>>>> named "virtio-blk/subsection", and it also has another vmstate named
>>>>>> "virtio" in the fields.
>>>>>> Then, during the migration, when trying to load this subsection for
>>>>>> "virtio", the prefix condition will pass for "virtio-blk/subsection" and
>>>>>> then the migration will break, because this vmstate does not have such a
>>>>>> subsection.
>>>>>>
>>>>>> In other words, if a field inside vmstate1 is set via vmstate2 with a
>>>>>> name that is a prefix of the parent vmstate, then the field can "steal"
>>>>>> a subsection belonging to the parent state.
>>>>>>
>>>>>> Fix it by checking `/` at the end of idstr.
>>>>> Checking versus '\' looks reasonable, however I'm still confused on the
>>>>> example given, and what problem you hit.
>>>>>
>>>>> Here, your concern seems to be that vmstate_subsection_load() can
>>>>> accidentally load a FIELD of the parent VMSD whose name is exactly the name
>>>>> of the parent VMSD (which will be the prefix of all subsections).
>>>> Thanks for reply,
>>>>
>>>> vmstate_subsection_load() while trying to load a FIELD (whose name is prefix
>>>> of parent name) of the parent VMSD will try to load parent subsection.
>>>>
>>>>> Now my question is, when reaching the line you modified below, it needs to
>>>>> be prefixed with QEMU_VM_SUBSECTION.  It means the src QEMU is dumping a
>>>>> subsection rather than a field.  OTOH, when dumping a field, we never dump
>>>>> any name; I don't think we name FIELD at all..
>>>>>
>>>>> Could you share the failure you hit in real life?  That might help to
>>>>> understand the problem on its own.
>>>> Here is code example:
>>>>
>>>> ```
>>>>
>>>> static const VMStateDescription vmstate_vhost_user_virtio_blk_inflight = {
>>>> .name = "virtio-blk/inflight",
>>>> .version_id = 2,
>>>> .needed = vhost_user_blk_inflight_needed,
>>>> .fields = (const VMStateField[]) {
>>>> VMSTATE_VHOST_INFLIGHT_REGION(inflight, VHostUserBlk),
>>>> VMSTATE_END_OF_LIST()
>>>> }
>>>> };
>>>>
>>>>
>>>> static const VMStateDescription vmstate_vhost_user_virtio_blk = {
>>>> .name = "virtio-blk",
>>>> .minimum_version_id = 2,
>>>> .version_id = 2,
>>>> .fields = (VMStateField[]) {
>>>> VMSTATE_VIRTIO_DEVICE,
>>>> VMSTATE_END_OF_LIST()
>>>> },
>>>> .subsections = (const VMStateDescription * []) {
>>>> &vmstate_vhost_user_virtio_blk_inflight,
>>>> NULL
>>>> }
>>>> };
>>>>
>>>> ```
>>>>
>>>> VMSTATE_VIRTIO_DEVICE is
>>>>
>>>> ```
>>>>
>>>> #define VMSTATE_VIRTIO_DEVICE \
>>>> { \
>>>> .name = "virtio", \
>>>> .info = &virtio_vmstate_info, \
>>>> .flags = VMS_SINGLE, \
>>>> }
>>>> ```
>>>> So here is "virtio-blk" vmsd that have "virtio" vmsd field and
>>>> "virtio-blk/inflight" subsection. This configuration will result in the fact
>>>> that inflight vmsd will not be loaded at all (assuming that it met all the
>>>> requirements). Qemu logs will contain trace_vmstate_subsection_load_bad
>>>> (lookup) error for "virtio" vmstate when loading "virtio-blk/inflight"
>>>> subsection.
>>> Thanks for the details.  However it didn't resolve my confusion.  Let me
>>> ask more explicitly.
>>>
>>> In this case, VMSTATE_VIRTIO_DEVICE will be a single field in the parent
>>> vmsd of "virtio-blk".  Meanwhile, "virtio-blk/inflight" will be the only
>>> subsection.
>>>
>>> Now, if you hit a LOOKUP error of trace_vmstate_subsection_load_bad(), it
>>> means the dst QEMU hits this:
>>>
>>> vmstate_subsection_load():
>>>           sub_vmsd = vmstate_get_subsection(vmsd->subsections, idstr);
>>>           if (sub_vmsd == NULL) {
>>>               trace_vmstate_subsection_load_bad(vmsd->name, idstr, "(lookup)");
>>>               error_setg(errp, "VM subsection '%s' in '%s' does not exist",
>>>                          idstr, vmsd->name);
>>>               return -ENOENT;
>>>           }
>> Yes, and it will return -ENOENT. This causes an error in upper function.
>> Error in destination qemu logs will looks like this:
>>
>> [... load virtio field ...]
>> vmstate_subsection_load_bad virtio: virtio-blk/inflight/(lookup)
> Yes, this implies vmstate_get_subsection() failure.
>
>> qemu-system-x86_64: Failed to load virtio-blk:virtio
> This implies the above failure happened in a nested load of
> "virtio-blk:virtio", rather than the top "virtio-blk".
>
> Here virtio_load invokes two vmstate_load_state on vdc->vmsd,
> vmstate_virtio.
>
> If the loader side was trying to lookup the "/inflight" subsection within
> e.g. vmstate_virtio then it will fail indeed, but I don't understand why it
> sees the "/inflight" subsection.  Can you help explain?

Yes, looks like it happens because migration stream for "virtio-blk" 
looks like this:

[virtio-blk header] [virtio-blk fields] [virtio-blk subsections]

"virtio-blk" contains "virtio" field, so migration stream is:

[virtio-blk header] [virtio header] [virtio fields] [virtio subsections] 
[virtio-blk subsections]

And when we load the subsections of the "virtio" device, 
vmstate_subsection_load() uses qemu_peek_byte() to try to figure out if 
this is his subsection. This is where we encounter an error.

Thus, the error occurs due to the fact that vmsd does not know how many 
subsections it has when loading (this does not appear anywhere in the 
migration stream), so it tries to load all the appropriate ones by names.

>
> Are you using different versions of QEMU on src/dst when testing?  If
> they're different (or with different patches applied), please spell them
> out.
>
> Or if you could help me to answer what is missing that I stated below it
> might also help me to figure out what I missed.  So far I'm still a bit
> lost and not yet understand why this patch helps..
>
> Thanks,
>
>> vmstate_load_field_error field "virtio" load failed, ret = -2
>> qemu-system-x86_64: error while loading state for instance 0x0 of device
>> '0000:80:03.0:00.0:00.0/virtio-blk'
>> qemu-system-x86_64: load of migration failed: No such file or directory
>>
>>> IIUC, when reaching here, the load of VMSTATE_VIRTIO_DEVICE should have
>>> been completed, because we always load fields before subsections.
>>>
>>> Meanwhile, when reaching here, "idstr" should be "virtio-blk/inflight",
>>> because that's essentailly the only subsection this parent VMSD has.
>>>
>>> vmstate_get_subsection() should try to lookup all subsections matching it,
>>> and it should find it.
>>>
>>> I do not yet get why the VMSTATE_VIRTIO_DEVICE can get involved in the
>>> lookup (e.g. it is not on vmsd->subsections), and why it caused failure.
>>>
>>> Could you help me to find where I missed?
>>>
>>> Thanks,
>>>
Re: [PATCH v1] vmstate: fix subsection load name check
Posted by Peter Xu 1 month, 2 weeks ago
On Thu, Feb 26, 2026 at 12:47:55PM +0500, Alexandr Moshkov wrote:
> Yes, looks like it happens because migration stream for "virtio-blk" looks
> like this:
> 
> [virtio-blk header] [virtio-blk fields] [virtio-blk subsections]
> 
> "virtio-blk" contains "virtio" field, so migration stream is:
> 
> [virtio-blk header] [virtio header] [virtio fields] [virtio subsections]
> [virtio-blk subsections]
> 
> And when we load the subsections of the "virtio" device,
> vmstate_subsection_load() uses qemu_peek_byte() to try to figure out if this
> is his subsection. This is where we encounter an error.
> 
> Thus, the error occurs due to the fact that vmsd does not know how many
> subsections it has when loading (this does not appear anywhere in the
> migration stream), so it tries to load all the appropriate ones by names.

OK I see, thanks for the details.

I think I missed one important thing, that we don't treat
trace_vmstate_subsection_load_bad("(prefix)") as errors.. instead we
fallback to the upper layer by returning zero.

Could you put most of above reply into the commit message?  I think it may
help people understand the commit.

We could also add one comment right above the extra "/" check in code, to
make sure nobody will accidentally remove it.

Thanks,

-- 
Peter Xu
Re: [PATCH v1] vmstate: fix subsection load name check
Posted by Alexandr Moshkov 1 month, 1 week ago
On 2/26/26 20:20, Peter Xu wrote:
> On Thu, Feb 26, 2026 at 12:47:55PM +0500, Alexandr Moshkov wrote:
>> Yes, looks like it happens because migration stream for "virtio-blk" looks
>> like this:
>>
>> [virtio-blk header] [virtio-blk fields] [virtio-blk subsections]
>>
>> "virtio-blk" contains "virtio" field, so migration stream is:
>>
>> [virtio-blk header] [virtio header] [virtio fields] [virtio subsections]
>> [virtio-blk subsections]
>>
>> And when we load the subsections of the "virtio" device,
>> vmstate_subsection_load() uses qemu_peek_byte() to try to figure out if this
>> is his subsection. This is where we encounter an error.
>>
>> Thus, the error occurs due to the fact that vmsd does not know how many
>> subsections it has when loading (this does not appear anywhere in the
>> migration stream), so it tries to load all the appropriate ones by names.
> OK I see, thanks for the details.
>
> I think I missed one important thing, that we don't treat
> trace_vmstate_subsection_load_bad("(prefix)") as errors.. instead we
> fallback to the upper layer by returning zero.
>
> Could you put most of above reply into the commit message?  I think it may
> help people understand the commit.
>
> We could also add one comment right above the extra "/" check in code, to
> make sure nobody will accidentally remove it.
>
> Thanks,
>
Okay, I'll do that. Thanks!