The virtio-pmem is a virtio variant of NVDIMM and just like
NVDIMM virtio-pmem also allows accessing host pages bypassing
guest page cache. The difference is that if a regular file is
used to back guest's NVDIMM (model='nvdimm') the persistence of
guest writes might not be guaranteed while with virtio-pmem it
is.
To express this new model at domain XML level, I've chose the
following:
<memory model='virtio' access='shared'>
<source>
<path>/tmp/virtio_pmem</path>
<pmem/>
</source>
<target>
<size unit='KiB'>524288</size>
</target>
<address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
</memory>
The <source/> looks like for model='nvdimm', except <pmem/> is
required - this is to allow for future expansion when the
model='virtio' without <pmem/> will describe virtio-mem (which is
a different device).
Another difference between NVDIMM and virtio-pmem is that while
the former supports NUMA node locality the latter doesn't. And
also, the latter goes onto PCI bus and not into a DIMM module.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
---
docs/formatdomain.rst | 35 +++++-
docs/schemas/domaincommon.rng | 1 +
src/conf/domain_conf.c | 113 ++++++++++++++++--
src/conf/domain_conf.h | 5 +
src/qemu/qemu_alias.c | 1 +
src/qemu/qemu_command.c | 2 +
src/qemu/qemu_domain.c | 26 ++++
src/qemu/qemu_domain_address.c | 88 +++++++++++---
src/qemu/qemu_domain_address.h | 3 +-
src/qemu/qemu_hotplug.c | 2 +-
src/qemu/qemu_validate.c | 1 +
src/security/security_apparmor.c | 1 +
src/security/security_dac.c | 2 +
src/security/security_selinux.c | 2 +
.../memory-hotplug-virtio-pmem.xml | 54 +++++++++
...mory-hotplug-virtio-pmem.x86_64-latest.xml | 1 +
tests/qemuxml2xmltest.c | 2 +
17 files changed, 304 insertions(+), 35 deletions(-)
create mode 100644 tests/qemuxml2argvdata/memory-hotplug-virtio-pmem.xml
create mode 120000 tests/qemuxml2xmloutdata/memory-hotplug-virtio-pmem.x86_64-latest.xml
diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst
index 512939679b..ca6bc0432e 100644
--- a/docs/formatdomain.rst
+++ b/docs/formatdomain.rst
@@ -7187,20 +7187,31 @@ Example: usage of the memory devices
</label>
</target>
</memory>
+ <memory model='virtio' access='shared'>
+ <source>
+ <path>/tmp/virtio_pmem</path>
+ <pmem/>
+ </source>
+ <target>
+ <size unit='KiB'>524288</size>
+ </target>
+ </memory>
</devices>
...
``model``
Provide ``dimm`` to add a virtual DIMM module to the guest. :since:`Since
- 1.2.14` Provide ``nvdimm`` model adds a Non-Volatile DIMM module.
- :since:`Since 3.2.0`
+ 1.2.14` Provide ``nvdimm`` model that adds a Non-Volatile DIMM module.
+ :since:`Since 3.2.0` Provide ``virtio`` model to add a paravirtualized
+ (persistent) memory device. :since:`Since 7.0.0`
``access``
An optional attribute ``access`` ( :since:`since 3.2.0` ) that provides
capability to fine tune mapping of the memory on per module basis. Values are
the same as `Memory Backing <#elementsMemoryBacking>`__: ``shared`` and
``private``. For ``nvdimm`` model, if using real NVDIMM DAX device as
- backend, ``shared`` is required.
+ backend, ``shared`` is required. For ``virtio`` model with ``<pmem/>``
+ ``shared`` is required.
``discard``
An optional attribute ``discard`` ( :since:`since 4.4.0` ) that provides
@@ -7229,9 +7240,9 @@ Example: usage of the memory devices
This element can be used to override the default set of NUMA nodes where
the memory would be allocated.
- For model ``nvdimm`` this element is mandatory. The mandatory child element
- ``path`` represents a path in the host that backs the nvdimm module in the
- guest. The following optional elements may be used:
+ For model ``nvdimm`` the ``source`` element is mandatory. The mandatory
+ child element ``path`` represents a path in the host that backs the nvdimm
+ module in the guest. The following optional elements may be used:
``alignsize``
The ``alignsize`` element defines the page size alignment used to mmap the
@@ -7245,6 +7256,18 @@ Example: usage of the memory devices
to guarantee the persistence of writes to the vNVDIMM backend, then use
the ``pmem`` element in order to utilize the feature. :since:`Since 5.0.0`
+ For model ``virtio`` the ``source`` element is mandatory. The following
+ optional elements may be used:
+
+ ``path``
+ Represents a path in the host that backs the virtio memory module in the
+ guest. It is mandatory for persistent memory (i.e. when ``pmem`` is also
+ set).
+
+ ``pmem``
+ If persistence of data is required then this element must be provided.
+
+
``target``
The mandatory ``target`` element configures the placement and sizing of the
added memory from the perspective of the guest.
diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng
index 795b654feb..b385bae84c 100644
--- a/docs/schemas/domaincommon.rng
+++ b/docs/schemas/domaincommon.rng
@@ -5973,6 +5973,7 @@
<choice>
<value>dimm</value>
<value>nvdimm</value>
+ <value>virtio</value>
</choice>
</attribute>
<optional>
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 4d462b0627..935bea1804 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -1332,6 +1332,7 @@ VIR_ENUM_IMPL(virDomainMemoryModel,
"",
"dimm",
"nvdimm",
+ "virtio",
);
VIR_ENUM_IMPL(virDomainShmemModel,
@@ -3143,6 +3144,9 @@ void virDomainMemoryDefFree(virDomainMemoryDefPtr def)
case VIR_DOMAIN_MEMORY_MODEL_NVDIMM:
VIR_FREE(def->s.nvdimm.path);
break;
+ case VIR_DOMAIN_MEMORY_MODEL_VIRTIO:
+ VIR_FREE(def->s.virtio.path);
+ break;
case VIR_DOMAIN_MEMORY_MODEL_NONE:
case VIR_DOMAIN_MEMORY_MODEL_LAST:
break;
@@ -5373,15 +5377,31 @@ static int
virDomainMemoryDefPostParse(virDomainMemoryDefPtr mem,
const virDomainDef *def)
{
- /* Although only the QEMU driver implements PPC64 support, this
- * code is related to the platform specification (PAPR), i.e. it
- * is hypervisor agnostic, and any future PPC64 hypervisor driver
- * will have the same restriction.
- */
- if (ARCH_IS_PPC64(def->os.arch) &&
- mem->model == VIR_DOMAIN_MEMORY_MODEL_NVDIMM &&
- virDomainNVDimmAlignSizePseries(mem) < 0)
- return -1;
+ switch (mem->model) {
+ case VIR_DOMAIN_MEMORY_MODEL_NVDIMM:
+ /* Although only the QEMU driver implements PPC64 support, this
+ * code is related to the platform specification (PAPR), i.e. it
+ * is hypervisor agnostic, and any future PPC64 hypervisor driver
+ * will have the same restriction.
+ */
+ if (ARCH_IS_PPC64(def->os.arch) &&
+ virDomainNVDimmAlignSizePseries(mem) < 0)
+ return -1;
+ break;
+
+ case VIR_DOMAIN_MEMORY_MODEL_VIRTIO:
+ /* Virtio-pmem mandates shared access so that guest writes get
+ * reflected in the underlying file. */
+ if (mem->s.virtio.pmem &&
+ mem->access == VIR_DOMAIN_MEMORY_ACCESS_DEFAULT)
+ mem->access = VIR_DOMAIN_MEMORY_ACCESS_SHARED;
+ break;
+
+ case VIR_DOMAIN_MEMORY_MODEL_DIMM:
+ case VIR_DOMAIN_MEMORY_MODEL_NONE:
+ case VIR_DOMAIN_MEMORY_MODEL_LAST:
+ break;
+ }
return 0;
}
@@ -6703,7 +6723,8 @@ static int
virDomainMemoryDefValidate(const virDomainMemoryDef *mem,
const virDomainDef *def)
{
- if (mem->model == VIR_DOMAIN_MEMORY_MODEL_NVDIMM) {
+ switch (mem->model) {
+ case VIR_DOMAIN_MEMORY_MODEL_NVDIMM:
if (!mem->s.nvdimm.path) {
virReportError(VIR_ERR_XML_DETAIL, "%s",
_("path is required for model 'nvdimm'"));
@@ -6721,6 +6742,39 @@ virDomainMemoryDefValidate(const virDomainMemoryDef *mem,
_("label size is required for NVDIMM device"));
return -1;
}
+ break;
+
+ case VIR_DOMAIN_MEMORY_MODEL_VIRTIO:
+ if (mem->s.virtio.pmem) {
+ if (!mem->s.virtio.path) {
+ virReportError(VIR_ERR_XML_DETAIL, "%s",
+ _("path is required for model 'virtio'"));
+ return -1;
+ }
+
+ if (mem->access != VIR_DOMAIN_MEMORY_ACCESS_SHARED) {
+ virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+ _("shared access mode required for virtio-pmem device"));
+ return -1;
+ }
+
+ if (mem->targetNode != -1) {
+ virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+ _("virtio-pmem does not support NUMA nodes"));
+ return -1;
+ }
+ } else {
+ /* TODO: plain virtio-mem behaves differently then virtio-pmem */
+ virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+ _("virtio-mem is not supported yet. <pmem/> is required"));
+ return -1;
+ }
+ break;
+
+ case VIR_DOMAIN_MEMORY_MODEL_NONE:
+ case VIR_DOMAIN_MEMORY_MODEL_DIMM:
+ case VIR_DOMAIN_MEMORY_MODEL_LAST:
+ break;
}
return 0;
@@ -16717,6 +16771,14 @@ virDomainMemorySourceDefParseXML(xmlNodePtr node,
break;
+ case VIR_DOMAIN_MEMORY_MODEL_VIRTIO:
+ def->s.virtio.path = virXPathString("string(./path)", ctxt);
+
+ if (virXPathBoolean("boolean(./pmem)", ctxt))
+ def->s.virtio.pmem = true;
+
+ break;
+
case VIR_DOMAIN_MEMORY_MODEL_NONE:
case VIR_DOMAIN_MEMORY_MODEL_LAST:
break;
@@ -18605,6 +18667,12 @@ virDomainMemoryFindByDefInternal(virDomainDefPtr def,
continue;
break;
+ case VIR_DOMAIN_MEMORY_MODEL_VIRTIO:
+ if (tmp->s.virtio.pmem != mem->s.virtio.pmem ||
+ STRNEQ_NULLABLE(tmp->s.virtio.path, mem->s.virtio.path))
+ continue;
+ break;
+
case VIR_DOMAIN_MEMORY_MODEL_NONE:
case VIR_DOMAIN_MEMORY_MODEL_LAST:
break;
@@ -24187,7 +24255,8 @@ virDomainMemoryDefCheckABIStability(virDomainMemoryDefPtr src,
return false;
}
- if (src->model == VIR_DOMAIN_MEMORY_MODEL_NVDIMM) {
+ switch (src->model) {
+ case VIR_DOMAIN_MEMORY_MODEL_NVDIMM:
if (src->labelsize != dst->labelsize) {
virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
_("Target NVDIMM label size '%llu' doesn't match "
@@ -24223,6 +24292,21 @@ virDomainMemoryDefCheckABIStability(virDomainMemoryDefPtr src,
_("Target NVDIMM UUID doesn't match source NVDIMM"));
return false;
}
+ break;
+
+ case VIR_DOMAIN_MEMORY_MODEL_VIRTIO:
+ if (src->s.virtio.pmem != dst->s.virtio.pmem) {
+ virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+ _("Target NVDIMM pmem flag doesn't match "
+ "source NVDIMM pmem flag"));
+ return false;
+ }
+ break;
+
+ case VIR_DOMAIN_MEMORY_MODEL_NONE:
+ case VIR_DOMAIN_MEMORY_MODEL_DIMM:
+ case VIR_DOMAIN_MEMORY_MODEL_LAST:
+ break;
}
return virDomainDeviceInfoCheckABIStability(&src->info, &dst->info);
@@ -27878,6 +27962,13 @@ virDomainMemorySourceDefFormat(virBufferPtr buf,
virBufferAddLit(&childBuf, "<pmem/>\n");
break;
+ case VIR_DOMAIN_MEMORY_MODEL_VIRTIO:
+ virBufferEscapeString(&childBuf, "<path>%s</path>\n", def->s.virtio.path);
+
+ if (def->s.virtio.pmem)
+ virBufferAddLit(&childBuf, "<pmem/>\n");
+ break;
+
case VIR_DOMAIN_MEMORY_MODEL_NONE:
case VIR_DOMAIN_MEMORY_MODEL_LAST:
break;
diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h
index e7f8fc156f..efaa4c5473 100644
--- a/src/conf/domain_conf.h
+++ b/src/conf/domain_conf.h
@@ -2306,6 +2306,7 @@ typedef enum {
VIR_DOMAIN_MEMORY_MODEL_NONE,
VIR_DOMAIN_MEMORY_MODEL_DIMM, /* dimm hotpluggable memory device */
VIR_DOMAIN_MEMORY_MODEL_NVDIMM, /* nvdimm memory device */
+ VIR_DOMAIN_MEMORY_MODEL_VIRTIO, /* virtio-{p}mem memory device */
VIR_DOMAIN_MEMORY_MODEL_LAST
} virDomainMemoryModel;
@@ -2325,6 +2326,10 @@ struct _virDomainMemoryDef {
unsigned long long alignsize; /* kibibytes */
bool pmem;
} nvdimm; /* VIR_DOMAIN_MEMORY_MODEL_NVDIMM */
+ struct {
+ char *path; /* Required for pmem, otherwise optional */
+ bool pmem;
+ } virtio; /* VIR_DOMAIN_MEMORY_MODEL_VIRTIO */
} s;
/* target */
diff --git a/src/qemu/qemu_alias.c b/src/qemu/qemu_alias.c
index 5ebcd766a9..931dbd4e84 100644
--- a/src/qemu/qemu_alias.c
+++ b/src/qemu/qemu_alias.c
@@ -498,6 +498,7 @@ qemuAssignDeviceMemoryAlias(virDomainDefPtr def,
case VIR_DOMAIN_MEMORY_MODEL_NVDIMM:
prefix = "nvdimm";
break;
+ case VIR_DOMAIN_MEMORY_MODEL_VIRTIO:
case VIR_DOMAIN_MEMORY_MODEL_NONE:
case VIR_DOMAIN_MEMORY_MODEL_LAST:
break;
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
index 4bd45e0638..9c50778180 100644
--- a/src/qemu/qemu_command.c
+++ b/src/qemu/qemu_command.c
@@ -2986,6 +2986,7 @@ qemuBuildMemoryBackendProps(virJSONValuePtr *backendProps,
alignsize = mem->s.nvdimm.alignsize;
nvdimmPmem = mem->s.nvdimm.pmem;
break;
+ case VIR_DOMAIN_MEMORY_MODEL_VIRTIO:
case VIR_DOMAIN_MEMORY_MODEL_NONE:
case VIR_DOMAIN_MEMORY_MODEL_LAST:
break;
@@ -3339,6 +3340,7 @@ qemuBuildMemoryDeviceStr(virDomainMemoryDefPtr mem)
break;
+ case VIR_DOMAIN_MEMORY_MODEL_VIRTIO:
case VIR_DOMAIN_MEMORY_MODEL_NONE:
case VIR_DOMAIN_MEMORY_MODEL_LAST:
break;
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index e705e8d8d5..ab7938a355 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -8485,6 +8485,8 @@ static int
qemuDomainDefValidateMemoryHotplugDevice(const virDomainMemoryDef *mem,
const virDomainDef *def)
{
+ bool needsNuma = true;
+
switch (mem->model) {
case VIR_DOMAIN_MEMORY_MODEL_DIMM:
case VIR_DOMAIN_MEMORY_MODEL_NVDIMM:
@@ -8520,11 +8522,35 @@ qemuDomainDefValidateMemoryHotplugDevice(const virDomainMemoryDef *mem,
}
break;
+ case VIR_DOMAIN_MEMORY_MODEL_VIRTIO:
+ if (mem->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_PCI &&
+ mem->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE) {
+ virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+ _("only 'pci' addresses are supported for the "
+ "virtio-pmem device"));
+ return -1;
+ }
+
+ /* virtio-pmem doesn't have .node attribute. */
+ if (mem->s.virtio.pmem)
+ needsNuma = false;
+ break;
+
case VIR_DOMAIN_MEMORY_MODEL_NONE:
case VIR_DOMAIN_MEMORY_MODEL_LAST:
return -1;
}
+ if (needsNuma &&
+ virDomainNumaGetNodeCount(def->numa) != 0) {
+ if (mem->targetNode == -1) {
+ virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+ _("target NUMA node needs to be specified for "
+ "memory device"));
+ return -1;
+ }
+ }
+
return 0;
}
diff --git a/src/qemu/qemu_domain_address.c b/src/qemu/qemu_domain_address.c
index d872f75b38..bcf6724da3 100644
--- a/src/qemu/qemu_domain_address.c
+++ b/src/qemu/qemu_domain_address.c
@@ -313,11 +313,11 @@ qemuDomainPrimeVirtioDeviceAddresses(virDomainDefPtr def,
virDomainDeviceAddressType type)
{
/*
- declare address-less virtio devices to be of address type 'type'
- disks, networks, videos, consoles, controllers, memballoon and rng
- in this order
- if type is ccw filesystem and vsock devices are declared to be of
- address type ccw
+ Declare address-less virtio devices to be of address type 'type'
+ disks, networks, videos, consoles, controllers, hostdevs, memballoon,
+ rngs and memories in this order.
+ If type is ccw filesystem and vsock devices are declared to be of
+ address type ccw.
*/
size_t i;
@@ -379,6 +379,12 @@ qemuDomainPrimeVirtioDeviceAddresses(virDomainDefPtr def,
def->rngs[i]->info.type = type;
}
+ for (i = 0; i < def->nmems; i++) {
+ if (def->mems[i]->model == VIR_DOMAIN_MEMORY_MODEL_VIRTIO &&
+ def->mems[i]->info.type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE)
+ def->mems[i]->info.type = type;
+ }
+
if (type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_CCW) {
for (i = 0; i < def->nfss; i++) {
if (def->fss[i]->info.type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE)
@@ -1040,11 +1046,23 @@ qemuDomainDeviceCalculatePCIConnectFlags(virDomainDeviceDefPtr dev,
}
break;
+ case VIR_DOMAIN_DEVICE_MEMORY:
+ switch (dev->data.memory->model) {
+ case VIR_DOMAIN_MEMORY_MODEL_VIRTIO:
+ return virtioFlags;
+
+ case VIR_DOMAIN_MEMORY_MODEL_NONE:
+ case VIR_DOMAIN_MEMORY_MODEL_DIMM:
+ case VIR_DOMAIN_MEMORY_MODEL_NVDIMM:
+ case VIR_DOMAIN_MEMORY_MODEL_LAST:
+ return 0;
+ }
+ break;
+
/* These devices don't ever connect with PCI */
case VIR_DOMAIN_DEVICE_NVRAM:
case VIR_DOMAIN_DEVICE_TPM:
case VIR_DOMAIN_DEVICE_PANIC:
- case VIR_DOMAIN_DEVICE_MEMORY:
case VIR_DOMAIN_DEVICE_HUB:
case VIR_DOMAIN_DEVICE_REDIRDEV:
case VIR_DOMAIN_DEVICE_SMARTCARD:
@@ -2455,6 +2473,17 @@ qemuDomainAssignDevicePCISlots(virDomainDefPtr def,
return -1;
}
+ for (i = 0; i < def->nmems; i++) {
+ virDomainMemoryDefPtr mem = def->mems[i];
+
+ if (mem->model != VIR_DOMAIN_MEMORY_MODEL_VIRTIO ||
+ !virDeviceInfoPCIAddressIsWanted(&mem->info))
+ continue;
+
+ if (qemuDomainPCIAddressReserveNextAddr(addrs, &mem->info) < 0)
+ return -1;
+ }
+
return 0;
}
@@ -3102,19 +3131,32 @@ qemuAssignMemoryDeviceSlot(virDomainMemoryDefPtr mem,
int
-qemuDomainAssignMemoryDeviceSlot(virDomainDefPtr def,
+qemuDomainAssignMemoryDeviceSlot(virQEMUDriverPtr driver,
+ virDomainObjPtr vm,
virDomainMemoryDefPtr mem)
{
- virBitmapPtr slotmap = NULL;
- int ret;
+ g_autoptr(virBitmap) slotmap = NULL;
+ virDomainDeviceDef dev = {.type = VIR_DOMAIN_DEVICE_MEMORY, .data.memory = mem};
- if (!(slotmap = qemuDomainGetMemorySlotMap(def)))
- return -1;
+ switch (mem->model) {
+ case VIR_DOMAIN_MEMORY_MODEL_DIMM:
+ case VIR_DOMAIN_MEMORY_MODEL_NVDIMM:
+ if (!(slotmap = qemuDomainGetMemorySlotMap(vm->def)))
+ return -1;
- ret = qemuAssignMemoryDeviceSlot(mem, slotmap);
+ return qemuAssignMemoryDeviceSlot(mem, slotmap);
+ break;
- virBitmapFree(slotmap);
- return ret;
+ case VIR_DOMAIN_MEMORY_MODEL_VIRTIO:
+ return qemuDomainEnsurePCIAddress(vm, &dev, driver);
+ break;
+
+ case VIR_DOMAIN_MEMORY_MODEL_NONE:
+ case VIR_DOMAIN_MEMORY_MODEL_LAST:
+ break;
+ }
+
+ return 0;
}
@@ -3132,8 +3174,22 @@ qemuDomainAssignMemorySlots(virDomainDefPtr def)
return -1;
for (i = 0; i < def->nmems; i++) {
- if (qemuAssignMemoryDeviceSlot(def->mems[i], slotmap) < 0)
- goto cleanup;
+ virDomainMemoryDefPtr mem = def->mems[i];
+
+ switch (mem->model) {
+ case VIR_DOMAIN_MEMORY_MODEL_DIMM:
+ case VIR_DOMAIN_MEMORY_MODEL_NVDIMM:
+ if (qemuAssignMemoryDeviceSlot(def->mems[i], slotmap) < 0)
+ goto cleanup;
+ break;
+
+ case VIR_DOMAIN_MEMORY_MODEL_VIRTIO:
+ /* handled in qemuDomainAssignPCIAddresses() */
+ break;
+ case VIR_DOMAIN_MEMORY_MODEL_NONE:
+ case VIR_DOMAIN_MEMORY_MODEL_LAST:
+ break;
+ }
}
ret = 0;
diff --git a/src/qemu/qemu_domain_address.h b/src/qemu/qemu_domain_address.h
index 7ef3308246..20a46160d5 100644
--- a/src/qemu/qemu_domain_address.h
+++ b/src/qemu/qemu_domain_address.h
@@ -56,7 +56,8 @@ void qemuDomainFillDeviceIsolationGroup(virDomainDefPtr def,
void qemuDomainReleaseDeviceAddress(virDomainObjPtr vm,
virDomainDeviceInfoPtr info);
-int qemuDomainAssignMemoryDeviceSlot(virDomainDefPtr def,
+int qemuDomainAssignMemoryDeviceSlot(virQEMUDriverPtr driver,
+ virDomainObjPtr vm,
virDomainMemoryDefPtr mem);
int qemuDomainEnsureVirtioAddress(bool *releaseAddr,
diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c
index 2c12ef60af..1f9df182bf 100644
--- a/src/qemu/qemu_hotplug.c
+++ b/src/qemu/qemu_hotplug.c
@@ -2402,7 +2402,7 @@ qemuDomainAttachMemory(virQEMUDriverPtr driver,
if (qemuDomainDefValidateMemoryHotplug(vm->def, priv->qemuCaps, mem) < 0)
goto cleanup;
- if (qemuDomainAssignMemoryDeviceSlot(vm->def, mem) < 0)
+ if (qemuDomainAssignMemoryDeviceSlot(driver, vm, mem) < 0)
goto cleanup;
/* in cases where we are using a VM with aliases generated according to the
diff --git a/src/qemu/qemu_validate.c b/src/qemu/qemu_validate.c
index 8ceea022d7..fc9bc7a5b4 100644
--- a/src/qemu/qemu_validate.c
+++ b/src/qemu/qemu_validate.c
@@ -4620,6 +4620,7 @@ qemuValidateDomainDeviceDefMemory(virDomainMemoryDefPtr mem,
}
break;
+ case VIR_DOMAIN_MEMORY_MODEL_VIRTIO:
case VIR_DOMAIN_MEMORY_MODEL_NONE:
case VIR_DOMAIN_MEMORY_MODEL_LAST:
break;
diff --git a/src/security/security_apparmor.c b/src/security/security_apparmor.c
index 78136e751e..b22ee739d8 100644
--- a/src/security/security_apparmor.c
+++ b/src/security/security_apparmor.c
@@ -693,6 +693,7 @@ AppArmorSetMemoryLabel(virSecurityManagerPtr mgr,
return -1;
}
return reload_profile(mgr, def, mem->s.nvdimm.path, true);
+ case VIR_DOMAIN_MEMORY_MODEL_VIRTIO:
case VIR_DOMAIN_MEMORY_MODEL_NONE:
case VIR_DOMAIN_MEMORY_MODEL_DIMM:
case VIR_DOMAIN_MEMORY_MODEL_LAST:
diff --git a/src/security/security_dac.c b/src/security/security_dac.c
index 44ee42e1bd..6b681c4021 100644
--- a/src/security/security_dac.c
+++ b/src/security/security_dac.c
@@ -1892,6 +1892,7 @@ virSecurityDACRestoreMemoryLabel(virSecurityManagerPtr mgr,
ret = virSecurityDACRestoreFileLabel(mgr, mem->s.nvdimm.path);
break;
+ case VIR_DOMAIN_MEMORY_MODEL_VIRTIO:
case VIR_DOMAIN_MEMORY_MODEL_DIMM:
case VIR_DOMAIN_MEMORY_MODEL_LAST:
case VIR_DOMAIN_MEMORY_MODEL_NONE:
@@ -2074,6 +2075,7 @@ virSecurityDACSetMemoryLabel(virSecurityManagerPtr mgr,
user, group, true);
break;
+ case VIR_DOMAIN_MEMORY_MODEL_VIRTIO:
case VIR_DOMAIN_MEMORY_MODEL_DIMM:
case VIR_DOMAIN_MEMORY_MODEL_LAST:
case VIR_DOMAIN_MEMORY_MODEL_NONE:
diff --git a/src/security/security_selinux.c b/src/security/security_selinux.c
index 294c9f1db5..77b69447da 100644
--- a/src/security/security_selinux.c
+++ b/src/security/security_selinux.c
@@ -1582,6 +1582,7 @@ virSecuritySELinuxSetMemoryLabel(virSecurityManagerPtr mgr,
return -1;
break;
+ case VIR_DOMAIN_MEMORY_MODEL_VIRTIO:
case VIR_DOMAIN_MEMORY_MODEL_NONE:
case VIR_DOMAIN_MEMORY_MODEL_DIMM:
case VIR_DOMAIN_MEMORY_MODEL_LAST:
@@ -1609,6 +1610,7 @@ virSecuritySELinuxRestoreMemoryLabel(virSecurityManagerPtr mgr,
ret = virSecuritySELinuxRestoreFileLabel(mgr, mem->s.nvdimm.path, true);
break;
+ case VIR_DOMAIN_MEMORY_MODEL_VIRTIO:
case VIR_DOMAIN_MEMORY_MODEL_DIMM:
case VIR_DOMAIN_MEMORY_MODEL_NONE:
case VIR_DOMAIN_MEMORY_MODEL_LAST:
diff --git a/tests/qemuxml2argvdata/memory-hotplug-virtio-pmem.xml b/tests/qemuxml2argvdata/memory-hotplug-virtio-pmem.xml
new file mode 100644
index 0000000000..3570b7d1b1
--- /dev/null
+++ b/tests/qemuxml2argvdata/memory-hotplug-virtio-pmem.xml
@@ -0,0 +1,54 @@
+<domain type='kvm'>
+ <name>QEMUGuest1</name>
+ <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid>
+ <maxMemory slots='16' unit='KiB'>1099511627776</maxMemory>
+ <memory unit='KiB'>2095104</memory>
+ <currentMemory unit='KiB'>2095104</currentMemory>
+ <vcpu placement='static' cpuset='0-1'>2</vcpu>
+ <os>
+ <type arch='x86_64' machine='pc'>hvm</type>
+ <boot dev='hd'/>
+ </os>
+ <cpu mode='custom' match='exact' check='none'>
+ <model fallback='forbid'>qemu64</model>
+ <topology sockets='2' dies='1' cores='1' threads='1'/>
+ <numa>
+ <cell id='0' cpus='0-1' memory='2095104' unit='KiB'/>
+ </numa>
+ </cpu>
+ <clock offset='utc'/>
+ <on_poweroff>destroy</on_poweroff>
+ <on_reboot>restart</on_reboot>
+ <on_crash>destroy</on_crash>
+ <devices>
+ <emulator>/usr/bin/qemu-system-i386</emulator>
+ <disk type='block' device='disk'>
+ <driver name='qemu' type='raw'/>
+ <source dev='/dev/HostVG/QEMUGuest1'/>
+ <target dev='hda' bus='ide'/>
+ <address type='drive' controller='0' bus='0' target='0' unit='0'/>
+ </disk>
+ <controller type='ide' index='0'>
+ <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
+ </controller>
+ <controller type='usb' index='0' model='piix3-uhci'>
+ <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
+ </controller>
+ <controller type='pci' index='0' model='pci-root'/>
+ <input type='mouse' bus='ps2'/>
+ <input type='keyboard' bus='ps2'/>
+ <memballoon model='virtio'>
+ <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
+ </memballoon>
+ <memory model='virtio' access='shared'>
+ <source>
+ <path>/tmp/virtio_pmem</path>
+ <pmem/>
+ </source>
+ <target>
+ <size unit='KiB'>524288</size>
+ </target>
+ <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
+ </memory>
+ </devices>
+</domain>
diff --git a/tests/qemuxml2xmloutdata/memory-hotplug-virtio-pmem.x86_64-latest.xml b/tests/qemuxml2xmloutdata/memory-hotplug-virtio-pmem.x86_64-latest.xml
new file mode 120000
index 0000000000..904425abe4
--- /dev/null
+++ b/tests/qemuxml2xmloutdata/memory-hotplug-virtio-pmem.x86_64-latest.xml
@@ -0,0 +1 @@
+../qemuxml2argvdata/memory-hotplug-virtio-pmem.xml
\ No newline at end of file
diff --git a/tests/qemuxml2xmltest.c b/tests/qemuxml2xmltest.c
index c006719dee..e804db8aee 100644
--- a/tests/qemuxml2xmltest.c
+++ b/tests/qemuxml2xmltest.c
@@ -1238,6 +1238,8 @@ mymain(void)
QEMU_CAPS_DEVICE_NVDIMM_UNARMED);
DO_TEST("memory-hotplug-nvdimm-ppc64", QEMU_CAPS_DEVICE_SPAPR_PCI_HOST_BRIDGE,
QEMU_CAPS_DEVICE_NVDIMM);
+ DO_TEST_CAPS_LATEST("memory-hotplug-virtio-pmem");
+
DO_TEST("net-udp", NONE);
DO_TEST("video-virtio-gpu-device", QEMU_CAPS_DEVICE_VIRTIO_GPU);
--
2.26.2
On Thu, Dec 03, 2020 at 13:36:15 +0100, Michal Privoznik wrote:
> The virtio-pmem is a virtio variant of NVDIMM and just like
> NVDIMM virtio-pmem also allows accessing host pages bypassing
> guest page cache. The difference is that if a regular file is
> used to back guest's NVDIMM (model='nvdimm') the persistence of
> guest writes might not be guaranteed while with virtio-pmem it
> is.
>
> To express this new model at domain XML level, I've chose the
> following:
>
> <memory model='virtio' access='shared'>
> <source>
> <path>/tmp/virtio_pmem</path>
> <pmem/>
> </source>
> <target>
> <size unit='KiB'>524288</size>
> </target>
> <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
> </memory>
>
> The <source/> looks like for model='nvdimm', except <pmem/> is
> required - this is to allow for future expansion when the
> model='virtio' without <pmem/> will describe virtio-mem (which is
> a different device).
This is not a good design. You use a property of <source> which
describes the backend to influence the model of the device, which is
definitely a frontend property as witnessed also by the ABI stability
check:
[I'm moving some hunks around for locality of my point]
> @@ -24223,6 +24292,21 @@ virDomainMemoryDefCheckABIStability(virDomainMemoryDefPtr src,
> _("Target NVDIMM UUID doesn't match source NVDIMM"));
> return false;
> }
> + break;
> +
> + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO:
> + if (src->s.virtio.pmem != dst->s.virtio.pmem) {
> + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
> + _("Target NVDIMM pmem flag doesn't match "
> + "source NVDIMM pmem flag"));
(also please error messages with no line breaks)
> + return false;
> + }
> + break;
> +
> + case VIR_DOMAIN_MEMORY_MODEL_NONE:
> + case VIR_DOMAIN_MEMORY_MODEL_DIMM:
> + case VIR_DOMAIN_MEMORY_MODEL_LAST:
> + break;
> }
>
> return virDomainDeviceInfoCheckABIStability(&src->info, &dst->info);
> @@ -6721,6 +6742,39 @@ virDomainMemoryDefValidate(const virDomainMemoryDef *mem,
> _("label size is required for NVDIMM device"));
> return -1;
> }
> + break;
> +
> + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO:
> + if (mem->s.virtio.pmem) {
> + if (!mem->s.virtio.path) {
> + virReportError(VIR_ERR_XML_DETAIL, "%s",
> + _("path is required for model 'virtio'"));
> + return -1;
> + }
> +
> + if (mem->access != VIR_DOMAIN_MEMORY_ACCESS_SHARED) {
> + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
> + _("shared access mode required for virtio-pmem device"));
> + return -1;
> + }
> +
> + if (mem->targetNode != -1) {
> + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
> + _("virtio-pmem does not support NUMA nodes"));
> + return -1;
> + }
> + } else {
> + /* TODO: plain virtio-mem behaves differently then virtio-pmem */
> + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
> + _("virtio-mem is not supported yet. <pmem/> is required"));
> + return -1;
This also doesn't seem to inspire confidence in the design.
I think we'll need to use 'virtio-mem' and 'virtio-pmem' for the models.
> + }
> + break;
> +
> + case VIR_DOMAIN_MEMORY_MODEL_NONE:
> + case VIR_DOMAIN_MEMORY_MODEL_DIMM:
> + case VIR_DOMAIN_MEMORY_MODEL_LAST:
> + break;
> }
>
> return 0;
[...]
> Another difference between NVDIMM and virtio-pmem is that while
> the former supports NUMA node locality the latter doesn't. And
> also, the latter goes onto PCI bus and not into a DIMM module.
I think you've used enough arguments to avoid starting the commit
message that it's a variant of nvdimm :). It's similar in usage, but
definitely very dissimilar in implementation.
[...]
> @@ -5373,15 +5377,31 @@ static int
> virDomainMemoryDefPostParse(virDomainMemoryDefPtr mem,
> const virDomainDef *def)
> {
> - /* Although only the QEMU driver implements PPC64 support, this
> - * code is related to the platform specification (PAPR), i.e. it
> - * is hypervisor agnostic, and any future PPC64 hypervisor driver
> - * will have the same restriction.
> - */
> - if (ARCH_IS_PPC64(def->os.arch) &&
> - mem->model == VIR_DOMAIN_MEMORY_MODEL_NVDIMM &&
> - virDomainNVDimmAlignSizePseries(mem) < 0)
> - return -1;
> + switch (mem->model) {
> + case VIR_DOMAIN_MEMORY_MODEL_NVDIMM:
> + /* Although only the QEMU driver implements PPC64 support, this
> + * code is related to the platform specification (PAPR), i.e. it
> + * is hypervisor agnostic, and any future PPC64 hypervisor driver
> + * will have the same restriction.
> + */
> + if (ARCH_IS_PPC64(def->os.arch) &&
> + virDomainNVDimmAlignSizePseries(mem) < 0)
> + return -1;
> + break;
> +
> + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO:
> + /* Virtio-pmem mandates shared access so that guest writes get
> + * reflected in the underlying file. */
I'd expect a check that 'access' is not
'VIR_DOMAIN_MEMORY_ACCESS_PRIVATE' explicitly if it's mandatory to use
shared access.
> + if (mem->s.virtio.pmem &&
> + mem->access == VIR_DOMAIN_MEMORY_ACCESS_DEFAULT)
> + mem->access = VIR_DOMAIN_MEMORY_ACCESS_SHARED;
> + break;
> +
> + case VIR_DOMAIN_MEMORY_MODEL_DIMM:
> + case VIR_DOMAIN_MEMORY_MODEL_NONE:
> + case VIR_DOMAIN_MEMORY_MODEL_LAST:
> + break;
> + }
>
> return 0;
> }
[...]
[...]
> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
> index e705e8d8d5..ab7938a355 100644
> --- a/src/qemu/qemu_domain.c
> +++ b/src/qemu/qemu_domain.c
> @@ -8485,6 +8485,8 @@ static int
> qemuDomainDefValidateMemoryHotplugDevice(const virDomainMemoryDef *mem,
> const virDomainDef *def)
> {
> + bool needsNuma = true;
> +
> switch (mem->model) {
> case VIR_DOMAIN_MEMORY_MODEL_DIMM:
> case VIR_DOMAIN_MEMORY_MODEL_NVDIMM:
> @@ -8520,11 +8522,35 @@ qemuDomainDefValidateMemoryHotplugDevice(const virDomainMemoryDef *mem,
> }
> break;
>
> + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO:
> + if (mem->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_PCI &&
> + mem->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE) {
> + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
> + _("only 'pci' addresses are supported for the "
> + "virtio-pmem device"));
> + return -1;
> + }
> +
> + /* virtio-pmem doesn't have .node attribute. */
> + if (mem->s.virtio.pmem)
> + needsNuma = false;
> + break;
> +
> case VIR_DOMAIN_MEMORY_MODEL_NONE:
> case VIR_DOMAIN_MEMORY_MODEL_LAST:
> return -1;
> }
>
> + if (needsNuma &&
> + virDomainNumaGetNodeCount(def->numa) != 0) {
> + if (mem->targetNode == -1) {
> + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
> + _("target NUMA node needs to be specified for "
> + "memory device"));
> + return -1;
> + }
> + }
> +
> return 0;
> }
>
> diff --git a/src/qemu/qemu_domain_address.c b/src/qemu/qemu_domain_address.c
> index d872f75b38..bcf6724da3 100644
> --- a/src/qemu/qemu_domain_address.c
> +++ b/src/qemu/qemu_domain_address.c
> @@ -313,11 +313,11 @@ qemuDomainPrimeVirtioDeviceAddresses(virDomainDefPtr def,
> virDomainDeviceAddressType type)
> {
> /*
> - declare address-less virtio devices to be of address type 'type'
> - disks, networks, videos, consoles, controllers, memballoon and rng
> - in this order
> - if type is ccw filesystem and vsock devices are declared to be of
> - address type ccw
> + Declare address-less virtio devices to be of address type 'type'
> + disks, networks, videos, consoles, controllers, hostdevs, memballoon,
> + rngs and memories in this order.
> + If type is ccw filesystem and vsock devices are declared to be of
> + address type ccw.
> */
> size_t i;
>
> @@ -379,6 +379,12 @@ qemuDomainPrimeVirtioDeviceAddresses(virDomainDefPtr def,
> def->rngs[i]->info.type = type;
> }
>
> + for (i = 0; i < def->nmems; i++) {
> + if (def->mems[i]->model == VIR_DOMAIN_MEMORY_MODEL_VIRTIO &&
> + def->mems[i]->info.type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE)
> + def->mems[i]->info.type = type;
> + }
> +
> if (type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_CCW) {
> for (i = 0; i < def->nfss; i++) {
> if (def->fss[i]->info.type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE)
> @@ -1040,11 +1046,23 @@ qemuDomainDeviceCalculatePCIConnectFlags(virDomainDeviceDefPtr dev,
> }
> break;
>
> + case VIR_DOMAIN_DEVICE_MEMORY:
> + switch (dev->data.memory->model) {
> + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO:
> + return virtioFlags;
> +
> + case VIR_DOMAIN_MEMORY_MODEL_NONE:
> + case VIR_DOMAIN_MEMORY_MODEL_DIMM:
> + case VIR_DOMAIN_MEMORY_MODEL_NVDIMM:
> + case VIR_DOMAIN_MEMORY_MODEL_LAST:
> + return 0;
> + }
> + break;
> +
> /* These devices don't ever connect with PCI */
> case VIR_DOMAIN_DEVICE_NVRAM:
> case VIR_DOMAIN_DEVICE_TPM:
> case VIR_DOMAIN_DEVICE_PANIC:
> - case VIR_DOMAIN_DEVICE_MEMORY:
> case VIR_DOMAIN_DEVICE_HUB:
> case VIR_DOMAIN_DEVICE_REDIRDEV:
> case VIR_DOMAIN_DEVICE_SMARTCARD:
> @@ -2455,6 +2473,17 @@ qemuDomainAssignDevicePCISlots(virDomainDefPtr def,
> return -1;
> }
>
> + for (i = 0; i < def->nmems; i++) {
> + virDomainMemoryDefPtr mem = def->mems[i];
> +
> + if (mem->model != VIR_DOMAIN_MEMORY_MODEL_VIRTIO ||
> + !virDeviceInfoPCIAddressIsWanted(&mem->info))
> + continue;
> +
> + if (qemuDomainPCIAddressReserveNextAddr(addrs, &mem->info) < 0)
> + return -1;
> + }
> +
> return 0;
> }
>
> @@ -3102,19 +3131,32 @@ qemuAssignMemoryDeviceSlot(virDomainMemoryDefPtr mem,
>
>
> int
> -qemuDomainAssignMemoryDeviceSlot(virDomainDefPtr def,
> +qemuDomainAssignMemoryDeviceSlot(virQEMUDriverPtr driver,
> + virDomainObjPtr vm,
> virDomainMemoryDefPtr mem)
> {
> - virBitmapPtr slotmap = NULL;
> - int ret;
> + g_autoptr(virBitmap) slotmap = NULL;
> + virDomainDeviceDef dev = {.type = VIR_DOMAIN_DEVICE_MEMORY, .data.memory = mem};
>
> - if (!(slotmap = qemuDomainGetMemorySlotMap(def)))
> - return -1;
> + switch (mem->model) {
> + case VIR_DOMAIN_MEMORY_MODEL_DIMM:
> + case VIR_DOMAIN_MEMORY_MODEL_NVDIMM:
> + if (!(slotmap = qemuDomainGetMemorySlotMap(vm->def)))
> + return -1;
>
> - ret = qemuAssignMemoryDeviceSlot(mem, slotmap);
> + return qemuAssignMemoryDeviceSlot(mem, slotmap);
> + break;
>
> - virBitmapFree(slotmap);
> - return ret;
> + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO:
> + return qemuDomainEnsurePCIAddress(vm, &dev, driver);
> + break;
> +
> + case VIR_DOMAIN_MEMORY_MODEL_NONE:
> + case VIR_DOMAIN_MEMORY_MODEL_LAST:
> + break;
> + }
> +
> + return 0;
> }
>
>
> @@ -3132,8 +3174,22 @@ qemuDomainAssignMemorySlots(virDomainDefPtr def)
> return -1;
>
> for (i = 0; i < def->nmems; i++) {
> - if (qemuAssignMemoryDeviceSlot(def->mems[i], slotmap) < 0)
> - goto cleanup;
> + virDomainMemoryDefPtr mem = def->mems[i];
> +
> + switch (mem->model) {
> + case VIR_DOMAIN_MEMORY_MODEL_DIMM:
> + case VIR_DOMAIN_MEMORY_MODEL_NVDIMM:
> + if (qemuAssignMemoryDeviceSlot(def->mems[i], slotmap) < 0)
> + goto cleanup;
> + break;
> +
> + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO:
> + /* handled in qemuDomainAssignPCIAddresses() */
> + break;
> + case VIR_DOMAIN_MEMORY_MODEL_NONE:
> + case VIR_DOMAIN_MEMORY_MODEL_LAST:
> + break;
> + }
> }
Preferably this commit will not introduce actual code to the qemu
driver. Please do this refactor separately, before introduction of the
new model. This applies to all of the above hunks.
© 2016 - 2026 Red Hat, Inc.