[PATCH v2] virtio-balloon: always indicate S_DONE when migration fails

David Hildenbrand posted 1 patch 3 years, 9 months ago
Test FreeBSD passed
Test docker-quick@centos7 passed
Test checkpatch passed
Test docker-mingw@fedora passed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20200629080615.26022-1-david@redhat.com
Maintainers: "Michael S. Tsirkin" <mst@redhat.com>, David Hildenbrand <david@redhat.com>
hw/virtio/virtio-balloon.c | 26 ++++++++++++++++++++------
1 file changed, 20 insertions(+), 6 deletions(-)
[PATCH v2] virtio-balloon: always indicate S_DONE when migration fails
Posted by David Hildenbrand 3 years, 9 months ago
If something goes wrong during precopy, before stopping the VM, we will
never send a S_DONE indication to the VM, resulting in the hinted pages
not getting released to be used by the guest OS (e.g., Linux).

Easy to reproduce:
1. Start migration (e.g., HMP "migrate -d 'exec:gzip -c > STATEFILE.gz'")
2. Cancel migration (e.g., HMP "migrate_cancel")
3. Oberve in the guest (e.g., cat /proc/meminfo) that there is basically
   no free memory left.

While at it, add similar locking to virtio_balloon_free_page_done() as
done in virtio_balloon_free_page_stop. Locking is still weird, but that
has to be sorted out separately.

There is nothing to do in the PRECOPY_NOTIFY_COMPLETE case. Add some
comments regarding S_DONE handling.

Fixes: c13c4153f76d ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT")
Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Cc: Wei Wang <wei.w.wang@intel.com>
Cc: Alexander Duyck <alexander.duyck@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 hw/virtio/virtio-balloon.c | 26 ++++++++++++++++++++------
 1 file changed, 20 insertions(+), 6 deletions(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 10507b2a43..8a84718490 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -628,8 +628,13 @@ static void virtio_balloon_free_page_done(VirtIOBalloon *s)
 {
     VirtIODevice *vdev = VIRTIO_DEVICE(s);
 
-    s->free_page_report_status = FREE_PAGE_REPORT_S_DONE;
-    virtio_notify_config(vdev);
+    if (s->free_page_report_status != FREE_PAGE_REPORT_S_DONE) {
+        /* See virtio_balloon_free_page_stop() */
+        qemu_mutex_lock(&s->free_page_lock);
+        s->free_page_report_status = FREE_PAGE_REPORT_S_DONE;
+        qemu_mutex_unlock(&s->free_page_lock);
+        virtio_notify_config(vdev);
+    }
 }
 
 static int
@@ -653,17 +658,26 @@ virtio_balloon_free_page_report_notify(NotifierWithReturn *n, void *data)
     case PRECOPY_NOTIFY_SETUP:
         precopy_enable_free_page_optimization();
         break;
-    case PRECOPY_NOTIFY_COMPLETE:
-    case PRECOPY_NOTIFY_CLEANUP:
     case PRECOPY_NOTIFY_BEFORE_BITMAP_SYNC:
         virtio_balloon_free_page_stop(dev);
         break;
     case PRECOPY_NOTIFY_AFTER_BITMAP_SYNC:
         if (vdev->vm_running) {
             virtio_balloon_free_page_start(dev);
-        } else {
-            virtio_balloon_free_page_done(dev);
+            break;
         }
+        /*
+         * Set S_DONE before migrating the vmstate, so the guest will reuse
+         * all hinted pages once running on the destination. Fall through.
+         */
+    case PRECOPY_NOTIFY_CLEANUP:
+        /*
+         * Especially, if something goes wrong during precopy or if migration
+         * is canceled, we have to properly communicate S_DONE to the VM.
+         */
+        virtio_balloon_free_page_done(dev);
+        break;
+    case PRECOPY_NOTIFY_COMPLETE:
         break;
     default:
         virtio_error(vdev, "%s: %d reason unknown", __func__, pnd->reason);
-- 
2.26.2


Re: [PATCH v2] virtio-balloon: always indicate S_DONE when migration fails
Posted by Michael S. Tsirkin 3 years, 9 months ago
On Mon, Jun 29, 2020 at 10:06:15AM +0200, David Hildenbrand wrote:
> If something goes wrong during precopy, before stopping the VM, we will
> never send a S_DONE indication to the VM, resulting in the hinted pages
> not getting released to be used by the guest OS (e.g., Linux).
> 
> Easy to reproduce:
> 1. Start migration (e.g., HMP "migrate -d 'exec:gzip -c > STATEFILE.gz'")
> 2. Cancel migration (e.g., HMP "migrate_cancel")
> 3. Oberve in the guest (e.g., cat /proc/meminfo) that there is basically
>    no free memory left.
> 
> While at it, add similar locking to virtio_balloon_free_page_done() as
> done in virtio_balloon_free_page_stop. Locking is still weird, but that
> has to be sorted out separately.
> 
> There is nothing to do in the PRECOPY_NOTIFY_COMPLETE case. Add some
> comments regarding S_DONE handling.
> 
> Fixes: c13c4153f76d ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT")
> Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> Cc: Wei Wang <wei.w.wang@intel.com>
> Cc: Alexander Duyck <alexander.duyck@gmail.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>

IIUC this is superceded by Alexander's patches right?
If not pls rebase ...

> ---
>  hw/virtio/virtio-balloon.c | 26 ++++++++++++++++++++------
>  1 file changed, 20 insertions(+), 6 deletions(-)
> 
> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> index 10507b2a43..8a84718490 100644
> --- a/hw/virtio/virtio-balloon.c
> +++ b/hw/virtio/virtio-balloon.c
> @@ -628,8 +628,13 @@ static void virtio_balloon_free_page_done(VirtIOBalloon *s)
>  {
>      VirtIODevice *vdev = VIRTIO_DEVICE(s);
>  
> -    s->free_page_report_status = FREE_PAGE_REPORT_S_DONE;
> -    virtio_notify_config(vdev);
> +    if (s->free_page_report_status != FREE_PAGE_REPORT_S_DONE) {
> +        /* See virtio_balloon_free_page_stop() */
> +        qemu_mutex_lock(&s->free_page_lock);
> +        s->free_page_report_status = FREE_PAGE_REPORT_S_DONE;
> +        qemu_mutex_unlock(&s->free_page_lock);
> +        virtio_notify_config(vdev);
> +    }
>  }
>  
>  static int
> @@ -653,17 +658,26 @@ virtio_balloon_free_page_report_notify(NotifierWithReturn *n, void *data)
>      case PRECOPY_NOTIFY_SETUP:
>          precopy_enable_free_page_optimization();
>          break;
> -    case PRECOPY_NOTIFY_COMPLETE:
> -    case PRECOPY_NOTIFY_CLEANUP:
>      case PRECOPY_NOTIFY_BEFORE_BITMAP_SYNC:
>          virtio_balloon_free_page_stop(dev);
>          break;
>      case PRECOPY_NOTIFY_AFTER_BITMAP_SYNC:
>          if (vdev->vm_running) {
>              virtio_balloon_free_page_start(dev);
> -        } else {
> -            virtio_balloon_free_page_done(dev);
> +            break;
>          }
> +        /*
> +         * Set S_DONE before migrating the vmstate, so the guest will reuse
> +         * all hinted pages once running on the destination. Fall through.
> +         */
> +    case PRECOPY_NOTIFY_CLEANUP:
> +        /*
> +         * Especially, if something goes wrong during precopy or if migration
> +         * is canceled, we have to properly communicate S_DONE to the VM.
> +         */
> +        virtio_balloon_free_page_done(dev);
> +        break;
> +    case PRECOPY_NOTIFY_COMPLETE:
>          break;
>      default:
>          virtio_error(vdev, "%s: %d reason unknown", __func__, pnd->reason);
> -- 
> 2.26.2


Re: [PATCH v2] virtio-balloon: always indicate S_DONE when migration fails
Posted by David Hildenbrand 3 years, 9 months ago
On 22.07.20 14:04, Michael S. Tsirkin wrote:
> On Mon, Jun 29, 2020 at 10:06:15AM +0200, David Hildenbrand wrote:
>> If something goes wrong during precopy, before stopping the VM, we will
>> never send a S_DONE indication to the VM, resulting in the hinted pages
>> not getting released to be used by the guest OS (e.g., Linux).
>>
>> Easy to reproduce:
>> 1. Start migration (e.g., HMP "migrate -d 'exec:gzip -c > STATEFILE.gz'")
>> 2. Cancel migration (e.g., HMP "migrate_cancel")
>> 3. Oberve in the guest (e.g., cat /proc/meminfo) that there is basically
>>    no free memory left.
>>
>> While at it, add similar locking to virtio_balloon_free_page_done() as
>> done in virtio_balloon_free_page_stop. Locking is still weird, but that
>> has to be sorted out separately.
>>
>> There is nothing to do in the PRECOPY_NOTIFY_COMPLETE case. Add some
>> comments regarding S_DONE handling.
>>
>> Fixes: c13c4153f76d ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT")
>> Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
>> Cc: Wei Wang <wei.w.wang@intel.com>
>> Cc: Alexander Duyck <alexander.duyck@gmail.com>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
> 
> IIUC this is superceded by Alexander's patches right?

Not that I know ... @Alex?

> If not pls rebase ...
> 



-- 
Thanks,

David / dhildenb


Re: [PATCH v2] virtio-balloon: always indicate S_DONE when migration fails
Posted by David Hildenbrand 3 years, 9 months ago
On 22.07.20 14:05, David Hildenbrand wrote:
> On 22.07.20 14:04, Michael S. Tsirkin wrote:
>> On Mon, Jun 29, 2020 at 10:06:15AM +0200, David Hildenbrand wrote:
>>> If something goes wrong during precopy, before stopping the VM, we will
>>> never send a S_DONE indication to the VM, resulting in the hinted pages
>>> not getting released to be used by the guest OS (e.g., Linux).
>>>
>>> Easy to reproduce:
>>> 1. Start migration (e.g., HMP "migrate -d 'exec:gzip -c > STATEFILE.gz'")
>>> 2. Cancel migration (e.g., HMP "migrate_cancel")
>>> 3. Oberve in the guest (e.g., cat /proc/meminfo) that there is basically
>>>    no free memory left.
>>>
>>> While at it, add similar locking to virtio_balloon_free_page_done() as
>>> done in virtio_balloon_free_page_stop. Locking is still weird, but that
>>> has to be sorted out separately.
>>>
>>> There is nothing to do in the PRECOPY_NOTIFY_COMPLETE case. Add some
>>> comments regarding S_DONE handling.
>>>
>>> Fixes: c13c4153f76d ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT")
>>> Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
>>> Cc: Wei Wang <wei.w.wang@intel.com>
>>> Cc: Alexander Duyck <alexander.duyck@gmail.com>
>>> Signed-off-by: David Hildenbrand <david@redhat.com>
>>
>> IIUC this is superceded by Alexander's patches right?
> 
> Not that I know ... @Alex?
> 

Okay, I'm confused, that patch is already upstream (via your tree)?

dd8eeb9671fc ("virtio-balloon: always indicate S_DONE when migration fails")

Did you stumble over this mail by mistake again?

-- 
Thanks,

David / dhildenb


Re: [PATCH v2] virtio-balloon: always indicate S_DONE when migration fails
Posted by Alexander Duyck 3 years, 9 months ago
On Wed, Jul 22, 2020 at 5:12 AM David Hildenbrand <david@redhat.com> wrote:
>
> On 22.07.20 14:05, David Hildenbrand wrote:
> > On 22.07.20 14:04, Michael S. Tsirkin wrote:
> >> On Mon, Jun 29, 2020 at 10:06:15AM +0200, David Hildenbrand wrote:
> >>> If something goes wrong during precopy, before stopping the VM, we will
> >>> never send a S_DONE indication to the VM, resulting in the hinted pages
> >>> not getting released to be used by the guest OS (e.g., Linux).
> >>>
> >>> Easy to reproduce:
> >>> 1. Start migration (e.g., HMP "migrate -d 'exec:gzip -c > STATEFILE.gz'")
> >>> 2. Cancel migration (e.g., HMP "migrate_cancel")
> >>> 3. Oberve in the guest (e.g., cat /proc/meminfo) that there is basically
> >>>    no free memory left.
> >>>
> >>> While at it, add similar locking to virtio_balloon_free_page_done() as
> >>> done in virtio_balloon_free_page_stop. Locking is still weird, but that
> >>> has to be sorted out separately.
> >>>
> >>> There is nothing to do in the PRECOPY_NOTIFY_COMPLETE case. Add some
> >>> comments regarding S_DONE handling.
> >>>
> >>> Fixes: c13c4153f76d ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT")
> >>> Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> >>> Cc: Wei Wang <wei.w.wang@intel.com>
> >>> Cc: Alexander Duyck <alexander.duyck@gmail.com>
> >>> Signed-off-by: David Hildenbrand <david@redhat.com>
> >>
> >> IIUC this is superceded by Alexander's patches right?
> >
> > Not that I know ... @Alex?
> >
>
> Okay, I'm confused, that patch is already upstream (via your tree)?
>
> dd8eeb9671fc ("virtio-balloon: always indicate S_DONE when migration fails")
>
> Did you stumble over this mail by mistake again?

I agree. David's patches should be upstream, and if they aren't they
should be applied before mine. Mine are meant to be a follow-on as I
end the set with the "report"->"hint" rename for several fields. The
idea is the fixes go first and the rename comes last.

Thanks.

- Alex

Re: [PATCH v2] virtio-balloon: always indicate S_DONE when migration fails
Posted by Michael S. Tsirkin 3 years, 9 months ago
On Wed, Jul 22, 2020 at 02:11:52PM +0200, David Hildenbrand wrote:
> On 22.07.20 14:05, David Hildenbrand wrote:
> > On 22.07.20 14:04, Michael S. Tsirkin wrote:
> >> On Mon, Jun 29, 2020 at 10:06:15AM +0200, David Hildenbrand wrote:
> >>> If something goes wrong during precopy, before stopping the VM, we will
> >>> never send a S_DONE indication to the VM, resulting in the hinted pages
> >>> not getting released to be used by the guest OS (e.g., Linux).
> >>>
> >>> Easy to reproduce:
> >>> 1. Start migration (e.g., HMP "migrate -d 'exec:gzip -c > STATEFILE.gz'")
> >>> 2. Cancel migration (e.g., HMP "migrate_cancel")
> >>> 3. Oberve in the guest (e.g., cat /proc/meminfo) that there is basically
> >>>    no free memory left.
> >>>
> >>> While at it, add similar locking to virtio_balloon_free_page_done() as
> >>> done in virtio_balloon_free_page_stop. Locking is still weird, but that
> >>> has to be sorted out separately.
> >>>
> >>> There is nothing to do in the PRECOPY_NOTIFY_COMPLETE case. Add some
> >>> comments regarding S_DONE handling.
> >>>
> >>> Fixes: c13c4153f76d ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT")
> >>> Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> >>> Cc: Wei Wang <wei.w.wang@intel.com>
> >>> Cc: Alexander Duyck <alexander.duyck@gmail.com>
> >>> Signed-off-by: David Hildenbrand <david@redhat.com>
> >>
> >> IIUC this is superceded by Alexander's patches right?
> > 
> > Not that I know ... @Alex?
> > 
> 
> Okay, I'm confused, that patch is already upstream (via your tree)?
> 
> dd8eeb9671fc ("virtio-balloon: always indicate S_DONE when migration fails")
> 
> Did you stumble over this mail by mistake again?
> 
> -- 
> Thanks,
> 
> David / dhildenb

Oh. I guess that's what happened. I saw the code in the tree and thought
it came in from Alex's patch.
Sorry about the noise.

-- 
MST


Re: [PATCH v2] virtio-balloon: always indicate S_DONE when migration fails
Posted by Michael S. Tsirkin 3 years, 9 months ago
On Wed, Jul 22, 2020 at 02:05:19PM +0200, David Hildenbrand wrote:
> On 22.07.20 14:04, Michael S. Tsirkin wrote:
> > On Mon, Jun 29, 2020 at 10:06:15AM +0200, David Hildenbrand wrote:
> >> If something goes wrong during precopy, before stopping the VM, we will
> >> never send a S_DONE indication to the VM, resulting in the hinted pages
> >> not getting released to be used by the guest OS (e.g., Linux).
> >>
> >> Easy to reproduce:
> >> 1. Start migration (e.g., HMP "migrate -d 'exec:gzip -c > STATEFILE.gz'")
> >> 2. Cancel migration (e.g., HMP "migrate_cancel")
> >> 3. Oberve in the guest (e.g., cat /proc/meminfo) that there is basically
> >>    no free memory left.
> >>
> >> While at it, add similar locking to virtio_balloon_free_page_done() as
> >> done in virtio_balloon_free_page_stop. Locking is still weird, but that
> >> has to be sorted out separately.
> >>
> >> There is nothing to do in the PRECOPY_NOTIFY_COMPLETE case. Add some
> >> comments regarding S_DONE handling.
> >>
> >> Fixes: c13c4153f76d ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT")
> >> Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> >> Cc: Wei Wang <wei.w.wang@intel.com>
> >> Cc: Alexander Duyck <alexander.duyck@gmail.com>
> >> Signed-off-by: David Hildenbrand <david@redhat.com>
> > 
> > IIUC this is superceded by Alexander's patches right?
> 
> Not that I know ... @Alex?
> 
> > If not pls rebase ...
> > 

OK then I guess I was confused. This is older so I guess I should
have applied this and asked Alex to rebase his patches, but I did the
reverse.., Sorry about that. Could you rebase on top of
the pci tree pls?


Thanks and sorry for messing up.

> 
> 
> -- 
> Thanks,
> 
> David / dhildenb