[RFC PATCH v2 0/5] virtio-balloon: Working Set Reporting

T.J. Alumbaugh posted 5 patches 10 months, 1 week ago
git fetch https://github.com/patchew-project/qemu tags/patchew/20230525222016.35333-1-talumbau@google.com
Maintainers: "Dr. David Alan Gilbert" <dave@treblig.org>, "Michael S. Tsirkin" <mst@redhat.com>, David Hildenbrand <david@redhat.com>, Markus Armbruster <armbru@redhat.com>, Eric Blake <eblake@redhat.com>, Eduardo Habkost <eduardo@habkost.net>, Marcel Apfelbaum <marcel.apfelbaum@gmail.com>, "Philippe Mathieu-Daudé" <philmd@linaro.org>, Yanan Wang <wangyanan55@huawei.com>
Posted by T.J. Alumbaugh 10 months, 1 week ago
This is the device implementation for the proposed expanded balloon feature
described here:

https://lore.kernel.org/linux-mm/20230509185419.1088297-1-yuanchu@google.com/

This series has a fixed number of "bins" for the working set report, but this is
not a constraint of the system. The bin number is fixed at device realization
time (in other implementations it is specified as a command line argument). Once
that number is fixed, this determines the correct number of bin intervals to
pass to the QMP/HMP function 'working_set_config'. Any feedback on how to
properly construct that function for this use case (passing a variable length
list?) would be appreciated.
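
To make the variable-length question concrete, here is a minimal sketch of the
check such a list would need. It is not code from this series. The helper name
ws_intervals_valid is hypothetical, and it assumes that N bins are separated by
N - 1 strictly increasing interval boundaries; the 2..16 bounds are the ones
quoted in the TODO list below.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bounds on the bin count, as quoted in the TODO list of this cover letter. */
#define WS_MIN_BINS 2
#define WS_MAX_BINS 16

/*
 * Hypothetical helper, not part of the series: check a variable-length
 * list of bin intervals against the bin count fixed at realization.
 * Assumption: N bins are separated by N - 1 strictly increasing boundaries.
 */
static bool ws_intervals_valid(const uint64_t *intervals, size_t count,
                               unsigned num_bins)
{
    if (num_bins < WS_MIN_BINS || num_bins > WS_MAX_BINS) {
        return false;
    }
    if (intervals == NULL || count != (size_t)num_bins - 1) {
        return false;
    }
    /* Each boundary must be strictly larger than the previous one. */
    for (size_t i = 1; i < count; i++) {
        if (intervals[i] <= intervals[i - 1]) {
            return false;
        }
    }
    return true;
}
```

With a check of this shape, the command could accept a list of any length and
reject it only when it does not match the realized bin count.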

New in V2:
=========

- Patch series is now: header file changes, device changes, QMP changes, HMP
changes, and migration changes.

- Example usages of the QMP and HMP interfaces are in their respective commit
messages.

- "ws" -> "working_set" throughout

Motivation
==========
As mentioned in the message linked above, the use case is a host with
overcommitted memory and one or more VMs. The goal is to get timely and
accurate information on overall memory utilization in order to drive
appropriate reclaim activities. In some client device use cases a VM might
need a significant fraction of the overall memory for a period of time, but
then enter a quiet period that leaves a large number of cold pages in the
guest.

The balloon device now has a number of features to assist in sharing memory
resources amongst the guests and host (e.g., free page hinting, stats, free
page reporting). However, as mentioned on slide 12 of [1], the balloon doesn't
have a good mechanism to drive the reclaim of guest cache. Our use case
includes both typical page cache and "application caches" with memory that
should be discarded in times of system-wide memory pressure. In some cases,
virtio-pmem can be a method for host control of guest cache, but it has
undesirable security implications.

Working Set Reporting
=====================
The patch series here includes:

 - Actual device implementation for VIRTIO_F_WS_REPORTING to standardize the
   configuration and communication of Working Set reports from the guest. This
   includes a notification virtqueue for receiving config information and
   requests for a report (a feature which could be expanded for additional use
   cases) and a virtqueue for the actual report from the driver.

 - QMP changes so that a controller program can use the existing QEMU socket
   mechanism to configure and request WS reports and then read the reports as
   a JSON property on the balloon.

Working Set reporting in the balloon provides:

 - an accurate picture of current memory utilization in the guest
 - event driven reporting (with configurable rate limiting) to deliver reports
   during times of memory pressure.
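
The rate limiting mentioned above can be pictured with a minimal sketch. It
assumes a single configurable minimum spacing between reports and a monotonic
millisecond clock; the type WsRateLimit and the function ws_report_allowed are
hypothetical names and do not come from the series or the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical rate limiter state; names are not from the series. */
typedef struct {
    uint64_t min_interval_ms; /* configurable spacing between reports */
    uint64_t last_report_ms;  /* timestamp of the last report sent */
    bool sent_any;            /* false until the first report goes out */
} WsRateLimit;

/*
 * Return true (and record the time) if a report may be sent at now_ms.
 * Assumes now_ms comes from a monotonic clock and never goes backwards.
 */
static bool ws_report_allowed(WsRateLimit *rl, uint64_t now_ms)
{
    if (rl->sent_any && now_ms - rl->last_report_ms < rl->min_interval_ms) {
        return false;
    }
    rl->last_report_ms = now_ms;
    rl->sent_any = true;
    return true;
}
```

An event that arrives too soon after the previous report is dropped here; a
real implementation might defer it instead.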

The reporting mechanism can be combined with a domain-specific balloon policy
to drive the separate reclaim activities in a coordinated fashion.

TODOs:
======
 - A synchronization mechanism must be added to the functions that send WS
   Config and WS Request; otherwise, concurrent callers (through QMP) can mix
   messages on the virtqueue sending the data to the driver.

 - The device currently has a hard-coded setting of 4 'bins' for a Working Set
   report, whereas the specification calls for anywhere between 2 and 16.

 - A WS_EVENT notification through QMP should include the actual report,
   whereas right now we query for that information right after a WS_EVENT is
   received.
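
For the first TODO item, a minimal sketch of the kind of serialization that is
missing: each complete message is appended under one lock, so two callers
cannot interleave their words. WsNotifyQueue, ws_queue_init and ws_queue_send
are hypothetical stand-ins for the notification virtqueue and its send path,
and a message is assumed to be an opcode followed by payload words. Plain
pthreads are used only to keep the sketch standalone; QEMU has its own locking
primitives.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define WS_QUEUE_CAPACITY 64

/* Hypothetical stand-in for the notification virtqueue. */
typedef struct {
    pthread_mutex_t lock;
    uint64_t slots[WS_QUEUE_CAPACITY];
    size_t used;
} WsNotifyQueue;

static void ws_queue_init(WsNotifyQueue *q)
{
    pthread_mutex_init(&q->lock, NULL);
    q->used = 0;
}

/*
 * Append one complete message (an opcode followed by its payload words)
 * while holding the lock, so concurrent callers cannot mix their words.
 * Returns 0 on success, -1 if the message does not fit.
 */
static int ws_queue_send(WsNotifyQueue *q, uint64_t opcode,
                         const uint64_t *payload, size_t len)
{
    int ret = -1;

    pthread_mutex_lock(&q->lock);
    if (len < WS_QUEUE_CAPACITY && q->used + len + 1 <= WS_QUEUE_CAPACITY) {
        q->slots[q->used++] = opcode;
        for (size_t i = 0; i < len; i++) {
            q->slots[q->used++] = payload[i];
        }
        ret = 0;
    }
    pthread_mutex_unlock(&q->lock);
    return ret;
}
```

The point of the sketch is only that the lock covers the whole message rather
than each element.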

References:

[1] https://kvmforum2020.sched.com/event/eE4U/virtio-balloonpmemmem-managing-guest-memory-david-hildenbrand-michael-s-tsirkin-red-hat

T.J. Alumbaugh (5):
  virtio-balloon: Add Working Set Reporting feature
  virtio-balloon: device has Working Set Reporting
  virtio-balloon: Add QMP functions for Working Set
  virtio-balloon: Add HMP functions for Working Set
  virtio-balloon: Migration of working set config

 hmp-commands.hx                               |  26 ++
 hw/core/machine-hmp-cmds.c                    |  21 ++
 hw/virtio/virtio-balloon-pci.c                |   2 +
 hw/virtio/virtio-balloon.c                    | 239 +++++++++++++++++-
 include/hw/virtio/virtio-balloon.h            |  13 +-
 include/monitor/hmp.h                         |   2 +
 .../standard-headers/linux/virtio_balloon.h   |  20 ++
 include/sysemu/balloon.h                      |   9 +-
 monitor/monitor.c                             |   1 +
 qapi/machine.json                             |  66 +++++
 qapi/misc.json                                |  26 ++
 softmmu/balloon.c                             |  31 ++-
 12 files changed, 449 insertions(+), 7 deletions(-)

-- 
2.41.0.rc0.172.g3f132b7071-goog
Re: [RFC PATCH v2 0/5] virtio-balloon: Working Set Reporting
Posted by David Hildenbrand 10 months ago
On 26.05.23 00:20, T.J. Alumbaugh wrote:
> This is the device implementation for the proposed expanded balloon feature
> described here:
> 
> https://lore.kernel.org/linux-mm/20230509185419.1088297-1-yuanchu@google.com/
> 
> This series has a fixed number of "bins" for the working set report, but this is
> not a constraint of the system. The bin number is fixed at device realization
> time (in other implementations it is specified as a command line argument). Once
> that number is fixed, this determines the correct number of bin intervals to
> pass to the QMP/HMP function 'working_set_config'. Any feedback on how to
> properly construct that function for this use case (passing a variable length
> list?) would be appreciated.
> 
> New in V2:
> =========
> 
> - Patch series is now: header file changes, device changes, QMP changes, HMP
> changes, and migration changes.
> 
> - Example usages of the QMP and HMP interfaces are in their respective commit
> messages.
> 
> - "ws" -> "working_set" throughout
> 
> Motivation
> ==========
> As mentioned in the above message, the use case is a host with overcommitted
> memory and 1 or more VMs. The goal is to get both timely and accurate
> information on overall memory utilization in order to drive appropriate
> reclaim activities, since in some client device use cases a VM might need a
> significant fraction of the overall memory for a period of time, but then
> enter a quiet period that results in a large number of cold pages in the
> guest.
> 
> The balloon device now has a number of features to assist in sharing memory
> resources amongst the guests and host (e.g free page hinting, stats, free page
> reporting). As mentioned in slide 12 in [1], the balloon doesn't have a good
> mechanism to drive the reclaim of guest cache. Our use case includes both
> typical page cache as well as "application caches" with memory that should be
> discarded in times of system-wide memory pressure. In some cases, virtio-pmem
> can be a method for host control of guest cache but there are undesirable
> security implications.
> 
> Working Set Reporting
> =====================
> The patch series here includes:
> 
>   - Actual device implementation for VIRTIO_F_WS_REPORTING to standardize the
>     configuration and communication of Working Set reports from the guest. This
>     includes a notification virtqueue for receiving config information and
>     requests for a report (a feature which could be expanded for additional use
>     cases) and a virtqueue for the actual report from the driver.

Could the config update be modeled using the config space instead?

Is the report asynchronous to the request, or how exactly do requests 
and reports interact?

> 
>   - QMP changes so that a controller program can use the existing QEMU socket
>     mechanism to configure and request WS reports and then read the reports as
>     a JSON property on the balloon.
> 
> Working Set reporting in the balloon provides:
> 
>   - an accurate picture of current memory utilization in the guest
>   - event driven reporting (with configurable rate limiting) to deliver reports
>     during times of memory pressure.
> 
> The reporting mechanism can be combined with a domain-specific balloon policy
> to drive the separate reclaim activities in a coordinated fashion.
> 
> TODOs:
> ======
>   -  A synchronization mechanism must be added to the functions that send WS
>      Config and WS Request, otherwise concurrent callers (through QMP) can mix
>      messages on the virtqueue sending the data to the driver.
> 
>   - The device currently has a hard-coded setting of 4 'bins' for a Working Set
>     report, whereas the specification calls for anywhere between 2 and 16.

Can you briefly summarize what a bin is, how one would decide how many 
bins one wants and what the whole purpose of a bin is?

Thanks!

-- 
Thanks,

David / dhildenb