[PATCH v4 14/14] kbase: Document virtio-mem

Michal Privoznik posted 14 patches 4 years, 7 months ago
There is a newer version of this series
Posted by Michal Privoznik 4 years, 7 months ago
This commit adds a new memorydevices.rst page which should serve
all models of memory devices. For now, only virtio-mem quirks
are documented.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
---
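Not part of the commit: a minimal Python sketch of the size rules the new
page states for virtio-mem (``<requested/>`` is zero or a multiple of
``<block/>``, and never larger than ``<size/>``). The names ``size_kib`` and
``check_virtio_mem`` are hypothetical helpers, not libvirt APIs, and the
sketch assumes all three elements are present and that a missing ``unit``
attribute means KiB.

```python
import xml.etree.ElementTree as ET

# Binary size units handled by this sketch; a missing unit is assumed to be KiB.
UNIT_TO_KIB = {"KiB": 1, "MiB": 1024, "GiB": 1024 ** 2}

# The 4GiB / 2MiB / 1GiB example from memorydevices.rst.
EXAMPLE = """
<memory model='virtio-mem'>
  <target>
    <size unit='KiB'>4194304</size>
    <block unit='KiB'>2048</block>
    <requested unit='KiB'>1048576</requested>
  </target>
</memory>
""".strip()


def size_kib(target, name):
    """Return the size stored in the <name/> child of <target/>, in KiB."""
    elem = target.find(name)
    return int(elem.text) * UNIT_TO_KIB[elem.get("unit", "KiB")]


def check_virtio_mem(device_xml):
    """Hypothetical helper: list the documented constraints that are violated."""
    target = ET.fromstring(device_xml).find("target")
    size = size_kib(target, "size")
    block = size_kib(target, "block")
    requested = size_kib(target, "requested")

    problems = []
    # Zero is a multiple of any block size, so "no blocks exposed" passes.
    if requested % block != 0:
        problems.append("requested is not a multiple of block")
    if requested > size:
        problems.append("requested exceeds size")
    return problems


if __name__ == "__main__":
    print(check_virtio_mem(EXAMPLE))
```

An empty list means the XML satisfies both rules; libvirt and QEMU perform
their own validation, so this is only a quick pre-check.
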
 docs/kbase/index.rst         |   4 +
 docs/kbase/memorydevices.rst | 150 +++++++++++++++++++++++++++++++++++
 docs/kbase/meson.build       |   1 +
 3 files changed, 155 insertions(+)
 create mode 100644 docs/kbase/memorydevices.rst

diff --git a/docs/kbase/index.rst b/docs/kbase/index.rst
index 91083ee49d..6355fe4f1d 100644
--- a/docs/kbase/index.rst
+++ b/docs/kbase/index.rst
@@ -52,6 +52,10 @@ Usage
 `PCI topology <../pci-addresses.html>`__
    Addressing schemes for PCI devices
 
+`Memory devices <memorydevices.html>`__
+   Memory devices and their use
+
+
 Internals / Debugging
 ---------------------
 
diff --git a/docs/kbase/memorydevices.rst b/docs/kbase/memorydevices.rst
new file mode 100644
index 0000000000..23ccd6da88
--- /dev/null
+++ b/docs/kbase/memorydevices.rst
@@ -0,0 +1,150 @@
+==============
+Memory devices
+==============
+
+.. contents::
+
+Basics
+======
+
+Memory devices can be divided into two families: volatile and non-volatile.
+The former is typical RAM: it's volatile and thus its contents don't survive
+guest reboots or power cycles. The latter retains its contents across reboots
+or power outages.
+
+In libvirt, there are two models for volatile memory:
+
+* ``dimm`` model:
+
+  ::
+
+    <memory model='dimm'>
+      <target>
+        <size unit='KiB'>523264</size>
+        <node>0</node>
+      </target>
+      <address type='dimm' slot='0'/>
+    </memory>
+
+* ``virtio-mem`` model:
+
+  ::
+
+    <memory model='virtio-mem'>
+      <target>
+        <size unit='KiB'>1048576</size>
+        <node>0</node>
+        <block unit='KiB'>2048</block>
+        <requested unit='KiB'>524288</requested>
+      </target>
+      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
+    </memory>
+
+Then there are two models for non-volatile memory:
+
+* ``nvdimm`` model:
+
+  ::
+
+    <memory model='nvdimm'>
+      <source>
+        <path>/tmp/nvdimm</path>
+      </source>
+      <target>
+        <size unit='KiB'>523264</size>
+        <node>0</node>
+      </target>
+      <address type='dimm' slot='0'/>
+    </memory>
+
+* ``virtio-pmem`` model:
+
+  ::
+
+    <memory model='virtio-pmem' access='shared'>
+      <source>
+        <path>/tmp/virtio_pmem</path>
+      </source>
+      <target>
+        <size unit='KiB'>524288</size>
+      </target>
+      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
+    </memory>
+
+
+Please note that (maybe somewhat surprisingly) virtio models go onto the PCI bus
+instead of DIMM slots.
+
+Furthermore, DIMMs can have a ``<source/>`` element which configures the
+backend for the device. For NVDIMMs the element is mandatory and reflects where
+the contents are saved.
+
+See `memory devices documentation <../formatdomain.html#elementsMemory>`_.
+
+``virtio-mem`` model
+====================
+
+The ``virtio-mem`` model can be viewed as a revised memory balloon. It offers
+adding and removing memory (without the actual hotplug of the device). It
+solves problems that a memory balloon can't solve on its own and thus is more
+flexible than a DIMM + balloon solution. ``virtio-mem`` is NUMA aware, and thus
+memory can be inflated/deflated only for a subset of guest NUMA nodes. Also, it
+works with chunks that are either exposed to the guest or reclaimed from it.
+
+See https://virtio-mem.gitlab.io/
+
+Under the hood, a ``virtio-mem`` device is split into chunks of equal size
+which are then exposed to the guest: either all of them or only a portion,
+depending on the user's request. Therefore there are three important sizes for
+``virtio-mem``, all found under the ``<target/>`` element:
+
+#. The maximum size the device can ever offer, exposed under ``<size/>``
+#. The size of a single block, exposed under ``<block/>``
+#. The current size exposed to the guest, exposed under ``<requested/>``
+
+For instance, in the following example the maximum size is 4GiB, the block size
+is 2MiB and only 1GiB should be exposed to the guest:
+
+  ::
+
+    <memory model='virtio-mem'>
+      <target>
+        <size unit='KiB'>4194304</size>
+        <block unit='KiB'>2048</block>
+        <requested unit='KiB'>1048576</requested>
+      </target>
+    </memory>
+
+Please note that ``<requested/>`` must be an integer multiple of ``<block/>``
+size or zero (no blocks exposed to the guest) and has to be less than or equal
+to ``<size/>`` (all blocks exposed to the guest). Furthermore, QEMU recommends
+the ``<block/>`` size to be as big as a Transparent Huge Page (usually 2MiB).
+
+To change the size exposed to the guest, users should pass the memory device
+XML with nothing but ``<requested/>`` changed into the
+``virDomainUpdateDeviceFlags()`` API. For the user's convenience this can be
+done via virsh too:
+
+ ::
+
+   # virsh update-memory-device $dom --requested-size 2GiB
+
+If there are two or more ``<memory/>`` devices then ``--alias`` shall be used
+to tell virsh which memory device should be updated.
+
+For running guests there is a fourth size that can be found under ``<target/>``:
+
+  ::
+
+    <actual unit='KiB'>2097152</actual>
+
+The ``<actual/>`` element reflects the size actually used by the guest. It can
+differ from ``<requested/>``, for instance when the guest kernel lacks the
+``virtio-mem`` module and thus cannot take the offered memory, or when it is
+unable to free memory. Since ``<actual/>`` only reports the size to users, the
+element is never parsed; it is formatted only into live XML.
+
+Since changing the actual allocation requires cooperation with the guest
+kernel, change requests are not instant. Therefore, libvirt emits the
+``VIR_DOMAIN_EVENT_ID_MEMORY_DEVICE_SIZE_CHANGE`` event whenever the actual
+allocation changes.
diff --git a/docs/kbase/meson.build b/docs/kbase/meson.build
index 7631b47018..f93f687efb 100644
--- a/docs/kbase/meson.build
+++ b/docs/kbase/meson.build
@@ -10,6 +10,7 @@ docs_kbase_files = [
   'locking-lockd',
   'locking',
   'locking-sanlock',
+  'memorydevices',
   'merging_disk_image_chains',
   'migrationinternals',
   'qemu-passthrough-security',
-- 
2.31.1
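
For readers who would rather call the API than virsh, here is a rough Python
sketch of preparing the XML that the page says to pass to
``virDomainUpdateDeviceFlags()``: the device XML with nothing but
``<requested/>`` changed. ``with_requested`` is a hypothetical helper name
and the sketch uses only the standard library; actually submitting the
result (for example through libvirt-python's ``updateDeviceFlags()``) is
left out and would need a live connection.

```python
import xml.etree.ElementTree as ET

# Device XML as it might be taken from the domain definition (the 4GiB
# example from memorydevices.rst).
DEVICE_XML = """
<memory model='virtio-mem'>
  <target>
    <size unit='KiB'>4194304</size>
    <block unit='KiB'>2048</block>
    <requested unit='KiB'>1048576</requested>
  </target>
</memory>
""".strip()


def with_requested(device_xml, requested_kib):
    """Hypothetical helper: return device_xml with only <requested/> changed."""
    root = ET.fromstring(device_xml)
    requested = root.find("target/requested")
    # Always write the new value in KiB so the unit and number stay consistent.
    requested.set("unit", "KiB")
    requested.text = str(requested_kib)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # 2GiB, the same target as in the virsh update-memory-device example.
    new_xml = with_requested(DEVICE_XML, 2 * 1024 * 1024)
    print(new_xml)
    # With libvirt-python, new_xml would then be handed to
    # dom.updateDeviceFlags(new_xml, flags) on an open domain object.
```

Everything except ``<requested/>`` is carried over untouched, which matches
the "nothing but ``<requested/>`` changed" wording in the page.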

Re: [PATCH v4 14/14] kbase: Document virtio-mem
Posted by Jim Fehlig 4 years, 7 months ago
On 6/23/21 4:12 AM, Michal Privoznik wrote:
> This commit adds new memorydevices.rst page which should serve
> all models of memory devices. Yet, I'm documenting virtio-mem
> quirks only.
> 
> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
> ---
>   docs/kbase/index.rst         |   4 +
>   docs/kbase/memorydevices.rst | 150 +++++++++++++++++++++++++++++++++++
>   docs/kbase/meson.build       |   1 +
>   3 files changed, 155 insertions(+)
>   create mode 100644 docs/kbase/memorydevices.rst
> 
> diff --git a/docs/kbase/index.rst b/docs/kbase/index.rst
> index 91083ee49d..6355fe4f1d 100644
> --- a/docs/kbase/index.rst
> +++ b/docs/kbase/index.rst
> @@ -52,6 +52,10 @@ Usage
>   `PCI topology <../pci-addresses.html>`__
>      Addressing schemes for PCI devices
>   
> +`Memory devices <memorydevices.html>`__
> +   Memory devices and their use
> +
> +
>   Internals / Debugging
>   ---------------------
>   
> diff --git a/docs/kbase/memorydevices.rst b/docs/kbase/memorydevices.rst
> new file mode 100644
> index 0000000000..23ccd6da88
> --- /dev/null
> +++ b/docs/kbase/memorydevices.rst
> @@ -0,0 +1,150 @@
> +==============
> +Memory devices
> +==============
> +
> +.. contents::
> +
> +Basics
> +======
> +
> +Memory devices can be divided into two families: volatile and non-volatile.
> +The former is typical RAM memory: it's volatile and thus its contents doesn't
> +survive reboots nor guest shut downs and power ons.

The last part of this sentence is a little awkward. How about something like 
"... its contents doesn't survive guest reboots or power cycles." ?

> The latter retains its
> +contents across reboots or power outages.
> +
> +In Libvirt, there are two models for volatile memory:
> +
> +* ``dimm`` model:
> +
> +  ::
> +
> +    <memory model='dimm'>
> +      <target>
> +        <size unit='KiB'>523264</size>
> +        <node>0</node>
> +      </target>
> +      <address type='dimm' slot='0'/>
> +    </memory>
> +
> +* ``virtio-mem`` model:
> +
> +  ::
> +
> +    <memory model='virtio-mem'>
> +      <target>
> +        <size unit='KiB'>1048576</size>
> +        <node>0</node>
> +        <block unit='KiB'>2048</block>
> +        <requested unit='KiB'>524288</requested>
> +      </target>
> +      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
> +    </memory>
> +
> +Then there are two models for non-volatile memory:
> +
> +* ``nvidmm`` model:

nvdimm

> +
> +  ::
> +
> +    <memory model='nvdimm'>
> +      <source>
> +        <path>/tmp/nvdimm</path>
> +      </source>
> +      <target>
> +        <size unit='KiB'>523264</size>
> +        <node>0</node>
> +      </target>
> +      <address type='dimm' slot='0'/>
> +    </memory>
> +
> +* ``virtio-pmem`` model:
> +
> +  ::
> +
> +    <memory model='virtio-pmem' access='shared'>
> +      <source>
> +        <path>/tmp/virtio_pmem</path>
> +      </source>
> +      <target>
> +        <size unit='KiB'>524288</size>
> +      </target>
> +      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
> +    </memory>
> +
> +
> +Please note that (maybe somewhat surprisingly) virtio models go onto PCI bus
> +instead of DIMM slots.
> +
> +Furthermore, DIMMs can have ``<source/>`` element which configures backend for
> +devices. For NVDIMMs the element is mandatory and reflects where the contents
> +is saved.

"where the content is saved" or "where the contents are saved"

> +
> +See `memory devices documentation <../formatdomain.html#elementsMemory>`_.
> +
> +``virtio-mem`` model
> +====================
> +
> +The ``virtio-mem`` model can be viewed as revised memory balloon. It offers
> +adding and removing memory (without the actual hotplug of the device). It
> +solves problems that memory balloon can't solve on its own and thus is more
> +flexible than DIMM + balloon solution. ``virtio-mem`` is NUMA aware, and thus
> +memory can be inflated/deflated only for a subset of guest NUMA nodes.  Also,
> +it works with chunks that are either exposed to guest or taken back from it.

"or reclaimed from it" ?

> +
> +See https://virtio-mem.gitlab.io/
> +
> +Under the hood, ``virtio-mem`` device is split into chunks of equal size which
> +are then exposed to the guest. Either all of them or only a portion depending
> +on user's request. Therefore there are three important sizes for
> +``virtio-mem``. All are to be found under ``<target/>`` element:
> +
> +#. The maximum size the device can ever offer, exposed under ``<size/>``
> +#. The size of a single block, exposed under ``<block/>``
> +#. The current size exposed to the guest, exposed under ``<requested/>``
> +
> +For instance, the following example the maximum size is 4GiB, the block size is

"For instance, in the following example ..."

> +2MiB and only 1GiB should be exposed to the guest:
> +
> +  ::
> +
> +    <memory model='virtio-mem'>
> +      <target>
> +        <size unit='KiB'>4194304</size>
> +        <block unit='KiB'>2048</block>
> +        <requested unit='KiB'>1048576</requested>
> +      </target>
> +    </memory>
> +
> +Please note that ``<requested/>`` must be an integer multiple of ``<block/>``
> +size or zero (no blocks exposed to the guest) and has to be less or equal to
> +``<size/>`` (all blocks exposed to the guest). Furthermore, QEMU recommends the
> +``<block/>`` size to be as big as a Transparent Huge Page (usually 2MiB).
> +
> +To change the size exposed to the guest, users should pass memory device XML
> +with nothing but ``<requested/>`` changed into the
> +``virDomainUpdateDeviceFlags()`` API. For user's convenience this can be done
> +via virsh too:
> +
> + ::
> +
> +   # virsh update-memory-device $dom --requested-size 2GiB
> +
> +If there are two or more ``<memory/>`` devices then ``--alias`` shall be used
> +to tell virsh which memory device should be updated.
> +
> +For running guests there is fourth size that can be found under ``<target/>``:
> +
> +  ::
> +
> +    <actual unit='KiB'>2097152</actual>
> +
> +The ``<actual/>`` reflects the actual size used by the guest. In general it
> +can differ from ``<requested/>``. Reasons include guest kernel missing
> +``virtio-mem`` module and thus being unable to take offered memory, or guest
> +kernel being unable to free memory.  Since ``<actual/>`` only reports size to
> +users, the element is never parsed. It is formatted only into live XML.
> +
> +Since changing actual allocation requires cooperation with guest kernel,
> +requests for change are not instant. Therefore, libvirt emits
> +``VIR_DOMAIN_EVENT_ID_MEMORY_DEVICE_SIZE_CHANGE`` event whenever actual
> +allocation changed.

Nice doc, and nice addition to the KB!

Reviewed-by: Jim Fehlig <jfehlig@suse.com>

Regards,
Jim

> diff --git a/docs/kbase/meson.build b/docs/kbase/meson.build
> index 7631b47018..f93f687efb 100644
> --- a/docs/kbase/meson.build
> +++ b/docs/kbase/meson.build
> @@ -10,6 +10,7 @@ docs_kbase_files = [
>     'locking-lockd',
>     'locking',
>     'locking-sanlock',
> +  'memorydevices',
>     'merging_disk_image_chains',
>     'migrationinternals',
>     'qemu-passthrough-security',
> 

Re: [PATCH v4 14/14] kbase: Document virtio-mem
Posted by Michal Prívozník 4 years, 7 months ago
On 6/23/21 5:52 PM, Jim Fehlig wrote:
> On 6/23/21 4:12 AM, Michal Privoznik wrote:
>> This commit adds new memorydevices.rst page which should serve
>> all models of memory devices. Yet, I'm documenting virtio-mem
>> quirks only.
>>
>> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
>> ---
>>   docs/kbase/index.rst         |   4 +
>>   docs/kbase/memorydevices.rst | 150 +++++++++++++++++++++++++++++++++++
>>   docs/kbase/meson.build       |   1 +
>>   3 files changed, 155 insertions(+)
>>   create mode 100644 docs/kbase/memorydevices.rst
>>


> Nice doc, and nice addition to the KB!
> 
> Reviewed-by: Jim Fehlig <jfehlig@suse.com>

Thanks, I've made the changes locally for now.

Michal