This commit adds a new memorydevices.rst page, which should serve
all models of memory devices. For now, I'm documenting virtio-mem
quirks only.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
---
docs/kbase/index.rst | 4 +
docs/kbase/memorydevices.rst | 150 +++++++++++++++++++++++++++++++++++
docs/kbase/meson.build | 1 +
3 files changed, 155 insertions(+)
create mode 100644 docs/kbase/memorydevices.rst
diff --git a/docs/kbase/index.rst b/docs/kbase/index.rst
index 91083ee49d..6355fe4f1d 100644
--- a/docs/kbase/index.rst
+++ b/docs/kbase/index.rst
@@ -52,6 +52,10 @@ Usage
`PCI topology <../pci-addresses.html>`__
Addressing schemes for PCI devices
+`Memory devices <memorydevices.html>`__
+ Memory devices and their use
+
+
Internals / Debugging
---------------------
diff --git a/docs/kbase/memorydevices.rst b/docs/kbase/memorydevices.rst
new file mode 100644
index 0000000000..23ccd6da88
--- /dev/null
+++ b/docs/kbase/memorydevices.rst
@@ -0,0 +1,150 @@
+==============
+Memory devices
+==============
+
+.. contents::
+
+Basics
+======
+
+Memory devices can be divided into two families: volatile and non-volatile.
+The former is typical RAM: it is volatile, and thus its contents do not
+survive guest reboots or power cycles. The latter retains its
+contents across reboots and power outages.
+
+In libvirt, there are two models for volatile memory:
+
+* ``dimm`` model:
+
+ ::
+
+ <memory model='dimm'>
+ <target>
+ <size unit='KiB'>523264</size>
+ <node>0</node>
+ </target>
+ <address type='dimm' slot='0'/>
+ </memory>
+
+* ``virtio-mem`` model:
+
+ ::
+
+ <memory model='virtio-mem'>
+ <target>
+ <size unit='KiB'>1048576</size>
+ <node>0</node>
+ <block unit='KiB'>2048</block>
+ <requested unit='KiB'>524288</requested>
+ </target>
+ <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
+ </memory>
+
+Then there are two models for non-volatile memory:
+
+* ``nvdimm`` model:
+
+ ::
+
+ <memory model='nvdimm'>
+ <source>
+ <path>/tmp/nvdimm</path>
+ </source>
+ <target>
+ <size unit='KiB'>523264</size>
+ <node>0</node>
+ </target>
+ <address type='dimm' slot='0'/>
+ </memory>
+
+* ``virtio-pmem`` model:
+
+ ::
+
+ <memory model='virtio-pmem' access='shared'>
+ <source>
+ <path>/tmp/virtio_pmem</path>
+ </source>
+ <target>
+ <size unit='KiB'>524288</size>
+ </target>
+ <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
+ </memory>
+
+
+Please note that (maybe somewhat surprisingly) virtio models go onto the PCI
+bus instead of into DIMM slots.
+
+Furthermore, DIMMs can have a ``<source/>`` element, which configures the
+backend for the device. For NVDIMMs the element is mandatory and reflects where
+the contents are saved.
+
+See `memory devices documentation <../formatdomain.html#elementsMemory>`_.
+
+``virtio-mem`` model
+====================
+
+The ``virtio-mem`` model can be viewed as a revised memory balloon. It allows
+adding and removing memory without actually hotplugging the device. It
+solves problems that the memory balloon can't solve on its own and thus is more
+flexible than a DIMM + balloon solution. ``virtio-mem`` is NUMA aware, and thus
+memory can be inflated/deflated only for a subset of guest NUMA nodes. Also,
+it works with chunks that are either exposed to the guest or reclaimed from it.
+
+See https://virtio-mem.gitlab.io/
+
+Under the hood, a ``virtio-mem`` device is split into chunks of equal size,
+either all or only a portion of which are exposed to the guest, depending
+on the user's request. Therefore, there are three important sizes for
+``virtio-mem``. All are found under the ``<target/>`` element:
+
+#. The maximum size the device can ever offer, exposed under ``<size/>``
+#. The size of a single block, exposed under ``<block/>``
+#. The size requested to be exposed to the guest, given under ``<requested/>``
+
+For instance, in the following example the maximum size is 4GiB, the block size
+is 2MiB and only 1GiB should be exposed to the guest:
+
+ ::
+
+ <memory model='virtio-mem'>
+ <target>
+ <size unit='KiB'>4194304</size>
+ <block unit='KiB'>2048</block>
+ <requested unit='KiB'>1048576</requested>
+ </target>
+ </memory>
+
+Please note that ``<requested/>`` must be an integer multiple of the ``<block/>``
+size, or zero (no blocks exposed to the guest), and has to be less than or equal
+to ``<size/>`` (all blocks exposed to the guest). Furthermore, QEMU recommends the
+``<block/>`` size to be as big as a Transparent Huge Page (usually 2MiB).
+
+To change the size exposed to the guest, users should pass the memory device XML
+with nothing but ``<requested/>`` changed to the
+``virDomainUpdateDeviceFlags()`` API. For the user's convenience, this can be
+done via virsh too:
+
+ ::
+
+ # virsh update-memory-device $dom --requested-size 2GiB
+
+If there are two or more ``<memory/>`` devices then ``--alias`` shall be used
+to tell virsh which memory device should be updated.
+
+For running guests there is a fourth size that can be found under ``<target/>``:
+
+ ::
+
+ <actual unit='KiB'>2097152</actual>
+
+The ``<actual/>`` element reflects the actual size used by the guest. In general
+it can differ from ``<requested/>``. Reasons include the guest kernel missing the
+``virtio-mem`` module and thus being unable to take offered memory, or the guest
+kernel being unable to free memory. Since ``<actual/>`` only reports the size to
+users, the element is never parsed. It is formatted only into live XML.
+
+Since changing the actual allocation requires cooperation with the guest kernel,
+requests for change are not instant. Therefore, libvirt emits the
+``VIR_DOMAIN_EVENT_ID_MEMORY_DEVICE_SIZE_CHANGE`` event whenever the actual
+allocation changes.
diff --git a/docs/kbase/meson.build b/docs/kbase/meson.build
index 7631b47018..f93f687efb 100644
--- a/docs/kbase/meson.build
+++ b/docs/kbase/meson.build
@@ -10,6 +10,7 @@ docs_kbase_files = [
'locking-lockd',
'locking',
'locking-sanlock',
+ 'memorydevices',
'merging_disk_image_chains',
'migrationinternals',
'qemu-passthrough-security',
--
2.31.1
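
The size rules that the new page states for ``<requested/>`` (zero or an integer multiple of ``<block/>``, and no larger than ``<size/>``) can be sketched in a few lines of Python. This is only an illustration and not part of the patch: the helper names `to_kib`, `element_kib`, `check_requested` and `with_requested` are hypothetical, only the KiB/MiB/GiB units from the examples are handled, and a missing `unit` attribute is assumed to mean KiB.

```python
import xml.etree.ElementTree as ET

# Only the units that appear in the patch's examples (assumption: libvirt
# accepts more units than these, but they are not needed for the sketch).
KIB_PER_UNIT = {"KiB": 1, "MiB": 1024, "GiB": 1024 ** 2}


def to_kib(value, unit="KiB"):
    """Convert a size given in KiB, MiB or GiB to KiB."""
    return value * KIB_PER_UNIT[unit]


def element_kib(target, name):
    """Read <name unit='...'>N</name> under <target/> and return it in KiB."""
    element = target.find(name)
    return to_kib(int(element.text), element.get("unit", "KiB"))


def check_requested(size_kib, block_kib, requested_kib):
    """Raise ValueError if the requested size breaks the documented rules."""
    if requested_kib < 0 or requested_kib > size_kib:
        raise ValueError("requested must be between 0 and <size/>")
    if requested_kib % block_kib != 0:
        raise ValueError("requested must be a multiple of <block/>")


def with_requested(device_xml, requested_kib):
    """Return the virtio-mem device XML with only <requested/> changed."""
    root = ET.fromstring(device_xml)
    target = root.find("target")
    check_requested(element_kib(target, "size"),
                    element_kib(target, "block"),
                    requested_kib)
    requested = target.find("requested")
    requested.set("unit", "KiB")
    requested.text = str(requested_kib)
    return ET.tostring(root, encoding="unicode")
```

With the 4GiB / 2MiB example from the page, `with_requested(xml, to_kib(2, "GiB"))` yields device XML whose ``<requested/>`` is 2097152 KiB, which is the kind of document the page describes passing to ``virDomainUpdateDeviceFlags()``. A request of 1000 KiB is rejected because it is not a multiple of the 2048 KiB block size.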
On 6/23/21 4:12 AM, Michal Privoznik wrote:
> This commit adds new memorydevices.rst page which should serve
> all models of memory devices. Yet, I'm documenting virtio-mem
> quirks only.
>
> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
> ---

[...]

> +Memory devices can be divided into two families: volatile and non-volatile.
> +The former is typical RAM memory: it's volatile and thus its contents doesn't
> +survive reboots nor guest shut downs and power ons.

The last part of this sentence is a little awkward. How about something like

"... its contents doesn't survive guest reboots or power cycles." ?

[...]

> +Then there are two models for non-volatile memory:
> +
> +* ``nvidmm`` model:

nvdimm

[...]

> +Furthermore, DIMMs can have ``<source/>`` element which configures backend for
> +devices. For NVDIMMs the element is mandatory and reflects where the contents
> +is saved.

"where the content is saved" or "where the contents are saved"

[...]

> +memory can be inflated/deflated only for a subset of guest NUMA nodes. Also,
> +it works with chunks that are either exposed to guest or taken back from it.

"or reclaimed from it" ?

[...]

> +For instance, the following example the maximum size is 4GiB, the block size is

"For instance, in the following example ..."

[...]

> +``VIR_DOMAIN_EVENT_ID_MEMORY_DEVICE_SIZE_CHANGE`` event whenever actual
> +allocation changed.

Nice doc, and nice addition to the KB!

Reviewed-by: Jim Fehlig <jfehlig@suse.com>

Regards,
Jim
On 6/23/21 5:52 PM, Jim Fehlig wrote:
> On 6/23/21 4:12 AM, Michal Privoznik wrote:
>> This commit adds new memorydevices.rst page which should serve
>> all models of memory devices. Yet, I'm documenting virtio-mem
>> quirks only.
>>
>> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
>> ---
>> docs/kbase/index.rst | 4 +
>> docs/kbase/memorydevices.rst | 150 +++++++++++++++++++++++++++++++++++
>> docs/kbase/meson.build | 1 +
>> 3 files changed, 155 insertions(+)
>> create mode 100644 docs/kbase/memorydevices.rst
>>
>
> Nice doc, and nice addition to the KB!
>
> Reviewed-by: Jim Fehlig <jfehlig@suse.com>

Thanks, I've made the changes locally for now.

Michal