[v2] virtio-balloon: free page hint reporting support

[Qemu-devel] [PATCH v2 0/3] virtio-balloon: free page hint reporting support

Posted by Wei Wang 8 years ago

This is the device-side implementation that adds a new feature,
VIRTIO_BALLOON_F_FREE_PAGE_HINT to the virtio-balloon device. The device
receives the guest free page hints from the driver and clears the
corresponding bits in the dirty bitmap, so that those free pages are
not transferred by the migration thread to the destination.
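
For readers new to the idea, the dirty-bitmap step described above can be
sketched roughly as follows. This is only an illustration and not the code
from the patches: clear_free_page_hint(), the one-bit-per-page bitmap layout
and the use of page frame numbers are all assumptions made for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical helper (not from the patch series): clear the dirty bits
 * for npages pages starting at first_pfn, as a free page hint would.
 * Returns how many bits were actually dirty, i.e. how many pages the
 * migration thread no longer needs to send.
 */
static size_t clear_free_page_hint(unsigned long *bitmap, size_t total_pages,
                                   uint64_t first_pfn, uint64_t npages)
{
    size_t cleared = 0;
    uint64_t end = first_pfn + npages;

    assert(bitmap != NULL);
    if (end > total_pages) {
        end = total_pages;          /* ignore anything past the bitmap */
    }

    for (uint64_t pfn = first_pfn; pfn < end; pfn++) {
        unsigned long *word = &bitmap[pfn / BITS_PER_WORD];
        unsigned long mask = 1UL << (pfn % BITS_PER_WORD);

        if (*word & mask) {
            *word &= ~mask;         /* page is free: skip it in migration */
            cleared++;
        }
    }
    return cleared;
}
```

Hinting the same range twice clears nothing the second time, so repeated
hints from the guest are harmless in this model.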

Please see the driver patch link for test results:
https://lkml.org/lkml/2018/2/4/60

ChangeLog:
v1->v2: 
    1) virtio-balloon
        - use subsections to save free_page_report_cmd_id;
        - poll the free page vq after sending a cmd id to the driver;
        - change the free page vq size to VIRTQUEUE_MAX_SIZE;
        - virtio_balloon_poll_free_page_hints: handle the corner case
          that the free page block reported from the driver may cross
          the RAMBlock boundary.
    2) migration/ram.c
        - use balloon_free_page_poll to start the optimization
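
The RAMBlock corner case mentioned above amounts to splitting one reported
range into per-block pieces. A minimal sketch of that splitting, under
assumptions: Region is a simplified stand-in for a RAMBlock (just a start
address and a length), and split_hint() and Piece are hypothetical names
that do not appear in the patches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a RAMBlock: a contiguous address range. */
typedef struct {
    uint64_t start;
    uint64_t len;
} Region;

/* One part of a hint that lies entirely inside a single block. */
typedef struct {
    size_t block;       /* index into the blocks array */
    uint64_t offset;    /* offset of the piece within that block */
    uint64_t len;       /* length of the piece in bytes */
} Piece;

/*
 * Hypothetical helper: intersect the hinted range [addr, addr + len)
 * with every block and record each non-empty overlap. Returns the
 * number of pieces written to out (at most max_out).
 */
static size_t split_hint(const Region *blocks, size_t nblocks,
                         uint64_t addr, uint64_t len,
                         Piece *out, size_t max_out)
{
    size_t n = 0;
    uint64_t end = addr + len;

    assert(blocks != NULL && out != NULL);

    for (size_t i = 0; i < nblocks && n < max_out; i++) {
        uint64_t bstart = blocks[i].start;
        uint64_t bend = bstart + blocks[i].len;
        uint64_t lo = addr > bstart ? addr : bstart;
        uint64_t hi = end < bend ? end : bend;

        if (lo < hi) {
            out[n].block = i;
            out[n].offset = lo - bstart;
            out[n].len = hi - lo;
            n++;
        }
    }
    return n;
}
```

Each piece can then be applied to the dirty bitmap of its own block, so a
hint that straddles two blocks never touches bits outside either of them.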

Wei Wang (3):
  virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT
  migration: use the free page reporting feature from balloon
  virtio-balloon: add a timer to limit the free page report waiting time

 balloon.c                                       |  39 ++--
 hw/virtio/virtio-balloon.c                      | 227 ++++++++++++++++++++++--
 hw/virtio/virtio-pci.c                          |   3 +
 include/hw/virtio/virtio-balloon.h              |  15 +-
 include/migration/misc.h                        |   3 +
 include/standard-headers/linux/virtio_balloon.h |   7 +
 include/sysemu/balloon.h                        |  12 +-
 migration/ram.c                                 |  34 +++-
 8 files changed, 307 insertions(+), 33 deletions(-)

-- 
1.8.3.1

Re: [Qemu-devel] [PATCH v2 0/3] virtio-balloon: free page hint reporting support

Posted by Michael S. Tsirkin 8 years ago

On Tue, Feb 06, 2018 at 07:08:16PM +0800, Wei Wang wrote:
> This is the device part implementation to add a new feature,
> VIRTIO_BALLOON_F_FREE_PAGE_HINT to the virtio-balloon device. The device
> receives the guest free page hints from the driver and clears the
> corresponding bits in the dirty bitmap, so that those free pages are
> not transferred by the migration thread to the destination.
> 
> Please see the driver patch link for test results:
> https://lkml.org/lkml/2018/2/4/60
> 
> ChangeLog:
> v1->v2: 
>     1) virtio-balloon
>         - use subsections to save free_page_report_cmd_id;
>         - poll the free page vq after sending a cmd id to the driver;
>         - change the free page vq size to VIRTQUEUE_MAX_SIZE;
>         - virtio_balloon_poll_free_page_hints: handle the corner case
>           that the free page block reported from the driver may cross
>           the RAMBlock boundary.
>     2) migration/ram.c
>         - use balloon_free_page_poll to start the optimization
> 
> Wei Wang (3):
>   virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT
>   migration: use the free page reporting feature from balloon
>   virtio-balloon: add a timer to limit the free page report waiting time

This feature needs in-tree documentation about possible ways to use it,
tradeoffs involved etc.


>  balloon.c                                       |  39 ++--
>  hw/virtio/virtio-balloon.c                      | 227 ++++++++++++++++++++++--
>  hw/virtio/virtio-pci.c                          |   3 +
>  include/hw/virtio/virtio-balloon.h              |  15 +-
>  include/migration/misc.h                        |   3 +
>  include/standard-headers/linux/virtio_balloon.h |   7 +
>  include/sysemu/balloon.h                        |  12 +-
>  migration/ram.c                                 |  34 +++-
>  8 files changed, 307 insertions(+), 33 deletions(-)
> 
> -- 
> 1.8.3.1

Re: [Qemu-devel] [PATCH v2 0/3] virtio-balloon: free page hint reporting support

Posted by Wei Wang 8 years ago

On 02/07/2018 08:02 AM, Michael S. Tsirkin wrote:
> On Tue, Feb 06, 2018 at 07:08:16PM +0800, Wei Wang wrote:
>> This is the device part implementation to add a new feature,
>> VIRTIO_BALLOON_F_FREE_PAGE_HINT to the virtio-balloon device. The device
>> receives the guest free page hints from the driver and clears the
>> corresponding bits in the dirty bitmap, so that those free pages are
>> not transferred by the migration thread to the destination.
>>
>> Please see the driver patch link for test results:
>> https://lkml.org/lkml/2018/2/4/60
>>
>> ChangeLog:
>> v1->v2:
>>      1) virtio-balloon
>>          - use subsections to save free_page_report_cmd_id;
>>          - poll the free page vq after sending a cmd id to the driver;
>>          - change the free page vq size to VIRTQUEUE_MAX_SIZE;
>>          - virtio_balloon_poll_free_page_hints: handle the corner case
>>            that the free page block reported from the driver may cross
>>            the RAMBlock boundary.
>>      2) migration/ram.c
>>          - use balloon_free_page_poll to start the optimization
>>
>> Wei Wang (3):
>>    virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT
>>    migration: use the free page reporting feature from balloon
>>    virtio-balloon: add a timer to limit the free page report waiting time
> This feature needs in-tree documentation about possible ways to use it,
> tradeoffs involved etc.

OK. I plan to add the documentation in later versions after we mostly 
finalize the QEMU part design.

Best,
Wei

Re: [Qemu-devel] [PATCH v2 0/3] virtio-balloon: free page hint reporting support

Posted by Dr. David Alan Gilbert 8 years ago

* Wei Wang (wei.w.wang@intel.com) wrote:
> This is the device part implementation to add a new feature,
> VIRTIO_BALLOON_F_FREE_PAGE_HINT to the virtio-balloon device. The device
> receives the guest free page hints from the driver and clears the
> corresponding bits in the dirty bitmap, so that those free pages are
> not transferred by the migration thread to the destination.
> 
> Please see the driver patch link for test results:
> https://lkml.org/lkml/2018/2/4/60

Hi Wei,
   I'll look at the code a bit more - but first some more basic
questions on that lkml post:

    a) The idle guest time thing is a nice result; can you just state
       what the host was, speed of connection, and what other options
       you were using?

    b) The workload test, the one with the kernel compile; you list
       the kernel compile time but don't mention any changes in the
       migration times of the ping-pong; can you give those times as
       well?

    c) What's your real workload that this is aimed at?
       Is it really for people migrating idle VMs - or do you have some
       NFV application in mind, if so why not include a figure for
       those?

Thanks,

Dave

> ChangeLog:
> v1->v2: 
>     1) virtio-balloon
>         - use subsections to save free_page_report_cmd_id;
>         - poll the free page vq after sending a cmd id to the driver;
>         - change the free page vq size to VIRTQUEUE_MAX_SIZE;
>         - virtio_balloon_poll_free_page_hints: handle the corner case
>           that the free page block reported from the driver may cross
>           the RAMBlock boundary.
>     2) migration/ram.c
>         - use balloon_free_page_poll to start the optimization
> 
> Wei Wang (3):
>   virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT
>   migration: use the free page reporting feature from balloon
>   virtio-balloon: add a timer to limit the free page report waiting time
> 
>  balloon.c                                       |  39 ++--
>  hw/virtio/virtio-balloon.c                      | 227 ++++++++++++++++++++++--
>  hw/virtio/virtio-pci.c                          |   3 +
>  include/hw/virtio/virtio-balloon.h              |  15 +-
>  include/migration/misc.h                        |   3 +
>  include/standard-headers/linux/virtio_balloon.h |   7 +
>  include/sysemu/balloon.h                        |  12 +-
>  migration/ram.c                                 |  34 +++-
>  8 files changed, 307 insertions(+), 33 deletions(-)
> 
> -- 
> 1.8.3.1
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

Re: [Qemu-devel] [PATCH v2 0/3] virtio-balloon: free page hint reporting support

Posted by Wei Wang 8 years ago

On 02/09/2018 04:15 AM, Dr. David Alan Gilbert wrote:
> * Wei Wang (wei.w.wang@intel.com) wrote:
>> This is the device part implementation to add a new feature,
>> VIRTIO_BALLOON_F_FREE_PAGE_HINT to the virtio-balloon device. The device
>> receives the guest free page hints from the driver and clears the
>> corresponding bits in the dirty bitmap, so that those free pages are
>> not transferred by the migration thread to the destination.
>>
>> Please see the driver patch link for test results:
>> https://lkml.org/lkml/2018/2/4/60
> Hi Wei,
>     I'll look at the code a bit more - but first some more basic
> questions on that lkml post:
>
>      a) The idle guest time thing is a nice result; can you just state
>         what the host was, speed of connection, and what other options
>         you were using?
>
>      b) The workload test, the one with the kernel compile; you list
>         the kernel compile time but don't mention any changes in the
>         migration times of the ping-pong; can you give those times as
>         well?
>
>      c) What's your real workload that this is aimed at?
>         Is it really for people migrating idle VMs - or do you have some
>         NFV application in mind, if so why not include a figure for
>         those?
>

Hi Dave,

Thanks for joining the review. Please see the info below.

a) Environment info
     - Host:
         - Physical CPU: Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz
         - kernel: 3.10.0

     - Guest:
         - kernel: 4.15.0
         - QEMU setup: -cpu host -M pc -smp 4,threads=1,sockets=1 -m 8G --mem-prealloc -realtime mlock=on -balloon virtio,free-page-hint=true

     - Migration setup:
         - migrate_set_speed 0
         - migrate_set_downtime 0.01  (10ms)

b) Michael asked the same question on the kernel patches. I'll reply 
there with you cc-ed, so that the kernel maintainers can also see it. Btw, 
are there any other workloads you would suggest trying?

c) This feature has been requested by many customers (e.g. general cloud 
vendors). It's for general use cases. As long as the guest has free 
memory, it will benefit from this optimization when doing migration. 
It's not specific to NFV, but NFV will certainly also benefit from this 
feature if we think about service chaining, where multiple VMs need to 
work together. In that case, migrating one VM breaks the working model, 
which means we need to migrate all the VMs. A shorter migration time 
will be very helpful.

Best,
Wei

Re: [Qemu-devel] [PATCH v2 0/3] virtio-balloon: free page hint reporting support

Posted by Dr. David Alan Gilbert 8 years ago

* Wei Wang (wei.w.wang@intel.com) wrote:
> On 02/09/2018 04:15 AM, Dr. David Alan Gilbert wrote:
> > * Wei Wang (wei.w.wang@intel.com) wrote:
> > > This is the device part implementation to add a new feature,
> > > VIRTIO_BALLOON_F_FREE_PAGE_HINT to the virtio-balloon device. The device
> > > receives the guest free page hints from the driver and clears the
> > > corresponding bits in the dirty bitmap, so that those free pages are
> > > not transferred by the migration thread to the destination.
> > > 
> > > Please see the driver patch link for test results:
> > > https://lkml.org/lkml/2018/2/4/60
> > Hi Wei,
> >     I'll look at the code a bit more - but first some more basic
> > questions on that lkml post:
> > 
> >      a) The idle guest time thing is a nice result; can you just state
> >         what the host was, speed of connection, and what other options
> >         you were using?
> > 
> >      b) The workload test, the one with the kernel compile; you list
> >         the kernel compile time but don't mention any changes in the
> >         migration times of the ping-pong; can you give those times as
> >         well?
> > 
> >      c) What's your real workload that this is aimed at?
> >         Is it really for people migrating idle VMs - or do you have some
> >         NFV application in mind, if so why not include a figure for
> >         those?
> > 
> 
> Hi Dave,
> 
> Thanks for joining the review. Please see below info.
> 
> a) Environment info
>     - Host:
>         - Physical CPU: Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz
>         - kernel: 3.10.0
> 
>     - Guest:
>         - kernel: 4.15.0
>         - QEMU setup: -cpu host -M pc -smp 4,threads=1,sockets=1 -m 8G
> --mem-prealloc -realtime mlock=on -balloon virtio,free-page-hint=true
> 
>     - Migration setup:
>         - migrate_set_speed 0
>         - migrate_set_downtime 0.01  (10ms)

That's an unusually low downtime (and I'm not sure what setting the
speed to 0 does!).

> b) Michael asked the same question on the kernel patches, I'll reply there
> with you cc-ed, so that kernel maintainers can also see it. Btw, do you have
> any other workloads you would suggest to have a try?

No, not really; I guess it's best for VMs that are either idle or have
lots of spare RAM.

> c) This feature is requested by many customers (e.g. general cloud vendors).
> It's for general use cases. As long as the guest has free memory, it will
> benefit from this optimization when doing migration. It's not specific for
> NFV usages, but for sure NFV will also benefit from this feature if we think
> about service chaining, where multiple VMs need to co-work with each other.
> In that case, migrating one VM will just break the working model, which
> means we will need to migrate all the VMs. A shorter migration time will be
> very helpful.

I thought of NFV because their VMs tend to have lots of extra RAM but
most seems unused most of the time.

Dave

> 
> Best,
> Wei
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

Re: [Qemu-devel] [PATCH v2 0/3] virtio-balloon: free page hint reporting support

Posted by Wei Wang 7 years, 11 months ago

On 02/09/2018 06:53 PM, Dr. David Alan Gilbert wrote:
> * Wei Wang (wei.w.wang@intel.com) wrote:
>> On 02/09/2018 04:15 AM, Dr. David Alan Gilbert wrote:
>>> * Wei Wang (wei.w.wang@intel.com) wrote:
>>>> This is the device part implementation to add a new feature,
>>>> VIRTIO_BALLOON_F_FREE_PAGE_HINT to the virtio-balloon device. The device
>>>> receives the guest free page hints from the driver and clears the
>>>> corresponding bits in the dirty bitmap, so that those free pages are
>>>> not transferred by the migration thread to the destination.
>>>>
>>>> Please see the driver patch link for test results:
>>>> https://lkml.org/lkml/2018/2/4/60
>>> Hi Wei,
>>>      I'll look at the code a bit more - but first some more basic
>>> questions on that lkml post:
>>>
>>>       a) The idle guest time thing is a nice result; can you just state
>>>          what the host was, speed of connection, and what other options
>>>          you were using?
>>>
>>>       b) The workload test, the one with the kernel compile; you list
>>>          the kernel compile time but don't mention any changes in the
>>>          migration times of the ping-pong; can you give those times as
>>>          well?
>>>
>>>       c) What's your real workload that this is aimed at?
>>>          Is it really for people migrating idle VMs - or do you have some
>>>          NFV application in mind, if so why not include a figure for
>>>          those?
>>>
>> Hi Dave,
>>
>> Thanks for joining the review. Please see below info.
>>
>> a) Environment info
>>      - Host:
>>          - Physical CPU: Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz
>>          - kernel: 3.10.0
>>
>>      - Guest:
>>          - kernel: 4.15.0
>>          - QEMU setup: -cpu host -M pc -smp 4,threads=1,sockets=1 -m 8G
>> --mem-prealloc -realtime mlock=on -balloon virtio,free-page-hint=true
>>
>>      - Migration setup:
>>          - migrate_set_speed 0
>>          - migrate_set_downtime 0.01  (10ms)
> That's an unusually low downtime (and I'm not sure what setting the
> speed to 0 does!).

For the idle guest tests, I used a 0.01s downtime. If we run workloads, we 
can change it to 2s. We just need to set the same downtime for both the 
legacy and the optimized cases so that we have an apples-to-apples 
comparison.

Setting the speed to 0 means using the largest available bandwidth. Here 
it is effectively the same as setting it to 100G.
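
To show why the downtime value matters for the comparison, here is a rough
model of the pre-copy completion check: migration can stop iterating once
the remaining dirty data could be sent within the allowed downtime at the
measured bandwidth. This is a sketch under assumptions, not QEMU's actual
code; can_complete() and its parameters are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model: the final stop-and-copy phase is allowed once the
 * pending dirty bytes fit under bandwidth * downtime.
 */
static bool can_complete(uint64_t pending_bytes, uint64_t bytes_per_ms,
                         uint64_t downtime_ms)
{
    /* Guard against overflow in the multiplication below. */
    assert(downtime_ms == 0 || bytes_per_ms <= UINT64_MAX / downtime_ms);

    uint64_t threshold = bytes_per_ms * downtime_ms;
    return pending_bytes < threshold;
}
```

At an assumed 1,000,000 bytes/ms, a 10ms downtime gives a threshold of about
10 MB, while 2s gives about 2 GB, so a busy guest converges far more easily
with the larger value.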

>
>> b) Michael asked the same question on the kernel patches, I'll reply there
>> with you cc-ed, so that kernel maintainers can also see it. Btw, do you have
>> any other workloads you would suggest to have a try?
> No, not really; I guess it's best for VMs that are either idle or have
> lots of spare RAM.

Yes, less free memory results in less improvement (please see the 
additional results with the Linux compilation workload shared on LKML).


Best,
Wei