[RFC PATCH v2 2/3] vfio: Set the priority of the VFIO VM state change handler explicitly

Shenming Lu posted 3 patches 5 years, 2 months ago
Maintainers: Alex Williamson <alex.williamson@redhat.com>, "Michael S. Tsirkin" <mst@redhat.com>, Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
There is a newer version of this series
[RFC PATCH v2 2/3] vfio: Set the priority of the VFIO VM state change handler explicitly
Posted by Shenming Lu 5 years, 2 months ago
In the VFIO VM state change handler, VFIO devices are transitioned
in the _SAVING state, which should keep them from sending interrupts.
Then we can save the pending states of all interrupts in the GIC VM
state change handler (on ARM).

So we have to set the priority of the VFIO VM state change handler
explicitly (like virtio devices) to ensure it is called before the
GIC's in saving.

Signed-off-by: Shenming Lu <lushenming@huawei.com>
Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
---
 hw/vfio/migration.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 3b9de1353a..97ea82b100 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -862,7 +862,8 @@ static int vfio_migration_init(VFIODevice *vbasedev,
     register_savevm_live(id, VMSTATE_INSTANCE_ID_ANY, 1, &savevm_vfio_handlers,
                          vbasedev);
 
-    migration->vm_state = qemu_add_vm_change_state_handler(vfio_vmstate_change,
+    migration->vm_state = qdev_add_vm_change_state_handler(vbasedev->dev,
+                                                           vfio_vmstate_change,
                                                            vbasedev);
     migration->migration_state.notify = vfio_migration_state_notifier;
     add_migration_state_change_notifier(&migration->migration_state);
-- 
2.19.1


Re: [RFC PATCH v2 2/3] vfio: Set the priority of the VFIO VM state change handler explicitly
Posted by Cornelia Huck 5 years, 2 months ago
On Wed, 9 Dec 2020 16:09:18 +0800
Shenming Lu <lushenming@huawei.com> wrote:

> In the VFIO VM state change handler, VFIO devices are transitioned
> in the _SAVING state, which should keep them from sending interrupts.
> Then we can save the pending states of all interrupts in the GIC VM
> state change handler (on ARM).
> 
> So we have to set the priority of the VFIO VM state change handler
> explicitly (like virtio devices) to ensure it is called before the
> GIC's in saving.

What this patch does is to make the priority of the vfio migration
state change handler depending on the position in the qdev tree. As all
state change handlers with no explicit priority are added at priority
0, this will make sure that this handler runs before (save) resp. after
(restore) nearly all other handlers, which will address your issue here
(and possibly similar ones).

So, this patch seems fine for now, but I'm wondering whether we need to
think more about priorities for handlers in general, and if there are
more hidden dependencies lurking in there.

> 
> Signed-off-by: Shenming Lu <lushenming@huawei.com>
> Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
> ---
>  hw/vfio/migration.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
> index 3b9de1353a..97ea82b100 100644
> --- a/hw/vfio/migration.c
> +++ b/hw/vfio/migration.c
> @@ -862,7 +862,8 @@ static int vfio_migration_init(VFIODevice *vbasedev,
>      register_savevm_live(id, VMSTATE_INSTANCE_ID_ANY, 1, &savevm_vfio_handlers,
>                           vbasedev);
>  
> -    migration->vm_state = qemu_add_vm_change_state_handler(vfio_vmstate_change,
> +    migration->vm_state = qdev_add_vm_change_state_handler(vbasedev->dev,
> +                                                           vfio_vmstate_change,
>                                                             vbasedev);
>      migration->migration_state.notify = vfio_migration_state_notifier;
>      add_migration_state_change_notifier(&migration->migration_state);


Re: [RFC PATCH v2 2/3] vfio: Set the priority of the VFIO VM state change handler explicitly
Posted by Shenming Lu 5 years, 2 months ago
On 2020/12/9 20:45, Cornelia Huck wrote:
> On Wed, 9 Dec 2020 16:09:18 +0800
> Shenming Lu <lushenming@huawei.com> wrote:
> 
>> In the VFIO VM state change handler, VFIO devices are transitioned
>> in the _SAVING state, which should keep them from sending interrupts.
>> Then we can save the pending states of all interrupts in the GIC VM
>> state change handler (on ARM).
>>
>> So we have to set the priority of the VFIO VM state change handler
>> explicitly (like virtio devices) to ensure it is called before the
>> GIC's in saving.
> 
> What this patch does is to make the priority of the vfio migration
> state change handler depending on the position in the qdev tree. As all
> state change handlers with no explicit priority are added at priority
> 0, this will make sure that this handler runs before (save) resp. after
> (restore) nearly all other handlers, which will address your issue here
> (and possibly similar ones).
> 
> So, this patch seems fine for now, but I'm wondering whether we need to
> think more about priorities for handlers in general, and if there are
> more hidden dependencies lurking in there.

As far as I know, as for the migration of interrupt, on x86 the sync from
the PIR field to the Virtual-APIC page for posted interrupts (in
KVM_GET_LAPIC ioctl) is after the pause of VFIO devices, which is fine.
Not sure about others...

Thanks,
Shenming

> 
>>
>> Signed-off-by: Shenming Lu <lushenming@huawei.com>
>> Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
>> ---
>>  hw/vfio/migration.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
>> index 3b9de1353a..97ea82b100 100644
>> --- a/hw/vfio/migration.c
>> +++ b/hw/vfio/migration.c
>> @@ -862,7 +862,8 @@ static int vfio_migration_init(VFIODevice *vbasedev,
>>      register_savevm_live(id, VMSTATE_INSTANCE_ID_ANY, 1, &savevm_vfio_handlers,
>>                           vbasedev);
>>  
>> -    migration->vm_state = qemu_add_vm_change_state_handler(vfio_vmstate_change,
>> +    migration->vm_state = qdev_add_vm_change_state_handler(vbasedev->dev,
>> +                                                           vfio_vmstate_change,
>>                                                             vbasedev);
>>      migration->migration_state.notify = vfio_migration_state_notifier;
>>      add_migration_state_change_notifier(&migration->migration_state);
> 
> .
> 

Re: [RFC PATCH v2 2/3] vfio: Set the priority of the VFIO VM state change handler explicitly
Posted by Cornelia Huck 5 years, 2 months ago
On Thu, 10 Dec 2020 11:11:17 +0800
Shenming Lu <lushenming@huawei.com> wrote:

> On 2020/12/9 20:45, Cornelia Huck wrote:
> > On Wed, 9 Dec 2020 16:09:18 +0800
> > Shenming Lu <lushenming@huawei.com> wrote:
> >   
> >> In the VFIO VM state change handler, VFIO devices are transitioned
> >> in the _SAVING state, which should keep them from sending interrupts.
> >> Then we can save the pending states of all interrupts in the GIC VM
> >> state change handler (on ARM).
> >>
> >> So we have to set the priority of the VFIO VM state change handler
> >> explicitly (like virtio devices) to ensure it is called before the
> >> GIC's in saving.  
> > 
> > What this patch does is to make the priority of the vfio migration
> > state change handler depending on the position in the qdev tree. As all
> > state change handlers with no explicit priority are added at priority
> > 0, this will make sure that this handler runs before (save) resp. after
> > (restore) nearly all other handlers, which will address your issue here
> > (and possibly similar ones).
> > 
> > So, this patch seems fine for now, but I'm wondering whether we need to
> > think more about priorities for handlers in general, and if there are
> > more hidden dependencies lurking in there.  
> 
> As far as I know, as for the migration of interrupt, on x86 the sync from
> the PIR field to the Virtual-APIC page for posted interrupts (in
> KVM_GET_LAPIC ioctl) is after the pause of VFIO devices, which is fine.
> Not sure about others...

Interrupt handling had been my first guess, but it seems most interrupt
controllers don't use a state change handler anyway.

"handlers in general" actually also referred to use cases outside of
vfio migration, so out of scope of this patch...

> 
> Thanks,
> Shenming
> 
> >   
> >>
> >> Signed-off-by: Shenming Lu <lushenming@huawei.com>
> >> Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
> >> ---
> >>  hw/vfio/migration.c | 3 ++-
> >>  1 file changed, 2 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
> >> index 3b9de1353a..97ea82b100 100644
> >> --- a/hw/vfio/migration.c
> >> +++ b/hw/vfio/migration.c
> >> @@ -862,7 +862,8 @@ static int vfio_migration_init(VFIODevice *vbasedev,
> >>      register_savevm_live(id, VMSTATE_INSTANCE_ID_ANY, 1, &savevm_vfio_handlers,
> >>                           vbasedev);
> >>  
> >> -    migration->vm_state = qemu_add_vm_change_state_handler(vfio_vmstate_change,
> >> +    migration->vm_state = qdev_add_vm_change_state_handler(vbasedev->dev,
> >> +                                                           vfio_vmstate_change,
> >>                                                             vbasedev);
> >>      migration->migration_state.notify = vfio_migration_state_notifier;
> >>      add_migration_state_change_notifier(&migration->migration_state);  
> > 
> > .
> >   
> 

In any case,

Reviewed-by: Cornelia Huck <cohuck@redhat.com>


Re: [RFC PATCH v2 2/3] vfio: Set the priority of the VFIO VM state change handler explicitly
Posted by Alex Williamson 5 years ago
On Wed, 9 Dec 2020 16:09:18 +0800
Shenming Lu <lushenming@huawei.com> wrote:

> In the VFIO VM state change handler, VFIO devices are transitioned
> in the _SAVING state, which should keep them from sending interrupts.

Is this comment accurate?  It's my expectation that _SAVING has no
bearing on a device generating interrupts.  Interrupt generation must
be allowed to continue so long as the device is _RUNNING.  Thanks,

Alex

> Then we can save the pending states of all interrupts in the GIC VM
> state change handler (on ARM).
> 
> So we have to set the priority of the VFIO VM state change handler
> explicitly (like virtio devices) to ensure it is called before the
> GIC's in saving.
> 
> Signed-off-by: Shenming Lu <lushenming@huawei.com>
> Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
> ---
>  hw/vfio/migration.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
> index 3b9de1353a..97ea82b100 100644
> --- a/hw/vfio/migration.c
> +++ b/hw/vfio/migration.c
> @@ -862,7 +862,8 @@ static int vfio_migration_init(VFIODevice *vbasedev,
>      register_savevm_live(id, VMSTATE_INSTANCE_ID_ANY, 1, &savevm_vfio_handlers,
>                           vbasedev);
>  
> -    migration->vm_state = qemu_add_vm_change_state_handler(vfio_vmstate_change,
> +    migration->vm_state = qdev_add_vm_change_state_handler(vbasedev->dev,
> +                                                           vfio_vmstate_change,
>                                                             vbasedev);
>      migration->migration_state.notify = vfio_migration_state_notifier;
>      add_migration_state_change_notifier(&migration->migration_state);


Re: [RFC PATCH v2 2/3] vfio: Set the priority of the VFIO VM state change handler explicitly
Posted by Shenming Lu 5 years ago
On 2021/1/27 5:36, Alex Williamson wrote:
> On Wed, 9 Dec 2020 16:09:18 +0800
> Shenming Lu <lushenming@huawei.com> wrote:
> 
>> In the VFIO VM state change handler, VFIO devices are transitioned
>> in the _SAVING state, which should keep them from sending interrupts.
> 
> Is this comment accurate?  It's my expectation that _SAVING has no
> bearing on a device generating interrupts.  Interrupt generation must
> be allowed to continue so long as the device is _RUNNING.  Thanks,
> 

To be more accurate, the _RUNNING bit in device_state is cleared in the
VFIO VM state change handler when stopping the VM. And if the device continues
to send interrupts after this, how can we save the states of device interrupts
in the stop-and-copy phase?...

Thanks,
Shenming

> Alex
> 
>> Then we can save the pending states of all interrupts in the GIC VM
>> state change handler (on ARM).
>>
>> So we have to set the priority of the VFIO VM state change handler
>> explicitly (like virtio devices) to ensure it is called before the
>> GIC's in saving.
>>
>> Signed-off-by: Shenming Lu <lushenming@huawei.com>
>> Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
>> ---
>>  hw/vfio/migration.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
>> index 3b9de1353a..97ea82b100 100644
>> --- a/hw/vfio/migration.c
>> +++ b/hw/vfio/migration.c
>> @@ -862,7 +862,8 @@ static int vfio_migration_init(VFIODevice *vbasedev,
>>      register_savevm_live(id, VMSTATE_INSTANCE_ID_ANY, 1, &savevm_vfio_handlers,
>>                           vbasedev);
>>  
>> -    migration->vm_state = qemu_add_vm_change_state_handler(vfio_vmstate_change,
>> +    migration->vm_state = qdev_add_vm_change_state_handler(vbasedev->dev,
>> +                                                           vfio_vmstate_change,
>>                                                             vbasedev);
>>      migration->migration_state.notify = vfio_migration_state_notifier;
>>      add_migration_state_change_notifier(&migration->migration_state);
> 
> .
> 

Re: [RFC PATCH v2 2/3] vfio: Set the priority of the VFIO VM state change handler explicitly
Posted by Alex Williamson 5 years ago
On Wed, 27 Jan 2021 19:20:06 +0800
Shenming Lu <lushenming@huawei.com> wrote:

> On 2021/1/27 5:36, Alex Williamson wrote:
> > On Wed, 9 Dec 2020 16:09:18 +0800
> > Shenming Lu <lushenming@huawei.com> wrote:
> >   
> >> In the VFIO VM state change handler, VFIO devices are transitioned
> >> in the _SAVING state, which should keep them from sending interrupts.  
> > 
> > Is this comment accurate?  It's my expectation that _SAVING has no
> > bearing on a device generating interrupts.  Interrupt generation must
> > be allowed to continue so long as the device is _RUNNING.  Thanks,
> >   
> 
> To be more accurate, the _RUNNING bit in device_state is cleared in the
> VFIO VM state change handler when stopping the VM. And if the device continues
> to send interrupts after this, how can we save the states of device interrupts
> in the stop-and-copy phase?...

Exactly, it's clearing the _RUNNING bit that makes the device stop,
including no longer generating interrupts.  Perhaps I incorrectly
inferred "_SAVING state" as referring to the _SAVING bit when you
actually intended:

   *  +------- _RESUMING
   *  |+------ _SAVING
   *  ||+----- _RUNNING
   *  |||
   *  000b => Device Stopped, not saving or resuming
   *  001b => Device running, which is the default state
-> *  010b => Stop the device & save the device state, stop-and-copy state

ie. the full state when only _SAVING is set.

Could we make the comment more clear to avoid this confusion?  Thanks,

Alex

> >> Then we can save the pending states of all interrupts in the GIC VM
> >> state change handler (on ARM).
> >>
> >> So we have to set the priority of the VFIO VM state change handler
> >> explicitly (like virtio devices) to ensure it is called before the
> >> GIC's in saving.
> >>
> >> Signed-off-by: Shenming Lu <lushenming@huawei.com>
> >> Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
> >> ---
> >>  hw/vfio/migration.c | 3 ++-
> >>  1 file changed, 2 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
> >> index 3b9de1353a..97ea82b100 100644
> >> --- a/hw/vfio/migration.c
> >> +++ b/hw/vfio/migration.c
> >> @@ -862,7 +862,8 @@ static int vfio_migration_init(VFIODevice *vbasedev,
> >>      register_savevm_live(id, VMSTATE_INSTANCE_ID_ANY, 1, &savevm_vfio_handlers,
> >>                           vbasedev);
> >>  
> >> -    migration->vm_state = qemu_add_vm_change_state_handler(vfio_vmstate_change,
> >> +    migration->vm_state = qdev_add_vm_change_state_handler(vbasedev->dev,
> >> +                                                           vfio_vmstate_change,
> >>                                                             vbasedev);
> >>      migration->migration_state.notify = vfio_migration_state_notifier;
> >>      add_migration_state_change_notifier(&migration->migration_state);  
> > 
> > .
> >   
> 


Re: [RFC PATCH v2 2/3] vfio: Set the priority of the VFIO VM state change handler explicitly
Posted by Shenming Lu 5 years ago
On 2021/1/27 22:20, Alex Williamson wrote:
> On Wed, 27 Jan 2021 19:20:06 +0800
> Shenming Lu <lushenming@huawei.com> wrote:
> 
>> On 2021/1/27 5:36, Alex Williamson wrote:
>>> On Wed, 9 Dec 2020 16:09:18 +0800
>>> Shenming Lu <lushenming@huawei.com> wrote:
>>>   
>>>> In the VFIO VM state change handler, VFIO devices are transitioned
>>>> in the _SAVING state, which should keep them from sending interrupts.  
>>>
>>> Is this comment accurate?  It's my expectation that _SAVING has no
>>> bearing on a device generating interrupts.  Interrupt generation must
>>> be allowed to continue so long as the device is _RUNNING.  Thanks,
>>>   
>>
>> To be more accurate, the _RUNNING bit in device_state is cleared in the
>> VFIO VM state change handler when stopping the VM. And if the device continues
>> to send interrupts after this, how can we save the states of device interrupts
>> in the stop-and-copy phase?...
> 
> Exactly, it's clearing the _RUNNING bit that makes the device stop,
> including no longer generating interrupts.  Perhaps I incorrectly
> inferred "_SAVING state" as referring to the _SAVING bit when you
> actually intended:
> 
>    *  +------- _RESUMING
>    *  |+------ _SAVING
>    *  ||+----- _RUNNING
>    *  |||
>    *  000b => Device Stopped, not saving or resuming
>    *  001b => Device running, which is the default state
> -> *  010b => Stop the device & save the device state, stop-and-copy state
> 
> ie. the full state when only _SAVING is set.
> 
> Could we make the comment more clear to avoid this confusion?  Thanks,
> 

OK, sorry for the confusion. I will modify the comment to:

In the VFIO VM state change handler when stopping the VM, the _RUNNING bit
in device_state is cleared which makes the VFIO device stop, including
no longer generating interrupts.

Thanks,
Shenming

> Alex
> 
>>>> Then we can save the pending states of all interrupts in the GIC VM
>>>> state change handler (on ARM).
>>>>
>>>> So we have to set the priority of the VFIO VM state change handler
>>>> explicitly (like virtio devices) to ensure it is called before the
>>>> GIC's in saving.
>>>>
>>>> Signed-off-by: Shenming Lu <lushenming@huawei.com>
>>>> Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
>>>> ---
>>>>  hw/vfio/migration.c | 3 ++-
>>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
>>>> index 3b9de1353a..97ea82b100 100644
>>>> --- a/hw/vfio/migration.c
>>>> +++ b/hw/vfio/migration.c
>>>> @@ -862,7 +862,8 @@ static int vfio_migration_init(VFIODevice *vbasedev,
>>>>      register_savevm_live(id, VMSTATE_INSTANCE_ID_ANY, 1, &savevm_vfio_handlers,
>>>>                           vbasedev);
>>>>  
>>>> -    migration->vm_state = qemu_add_vm_change_state_handler(vfio_vmstate_change,
>>>> +    migration->vm_state = qdev_add_vm_change_state_handler(vbasedev->dev,
>>>> +                                                           vfio_vmstate_change,
>>>>                                                             vbasedev);
>>>>      migration->migration_state.notify = vfio_migration_state_notifier;
>>>>      add_migration_state_change_notifier(&migration->migration_state);  
>>>
>>> .
>>>   
>>
> 
> .
>