backends/hostmem-file.c | 1 + include/exec/cpu-common.h | 1 + include/exec/memory.h | 3 +++ migration/ram.c | 3 ++- softmmu/physmem.c | 7 ++++++- 5 files changed, 13 insertions(+), 2 deletions(-)
migrate_ignore_shared() is an optimization that avoids copying memory
that is visible and can be mapped on the target. However, a
memory-backend-ram or a memory-backend-memfd block with the RAM_SHARED
flag set is not migrated when migrate_ignore_shared() is true. This is
wrong, because the block has no named backing store, and its contents will
be lost. To fix, ignore shared memory iff it is a named file. Define a
new flag RAM_NAMED_FILE to distinguish this case.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
backends/hostmem-file.c | 1 +
include/exec/cpu-common.h | 1 +
include/exec/memory.h | 3 +++
migration/ram.c | 3 ++-
softmmu/physmem.c | 7 ++++++-
5 files changed, 13 insertions(+), 2 deletions(-)
diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
index 2514128..486d77d 100644
--- a/backends/hostmem-file.c
+++ b/backends/hostmem-file.c
@@ -56,6 +56,7 @@ file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
ram_flags = backend->share ? RAM_SHARED : 0;
ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
ram_flags |= fb->is_pmem ? RAM_PMEM : 0;
+ ram_flags |= RAM_NAMED_FILE;
memory_region_init_ram_from_file(&backend->mr, OBJECT(backend), name,
backend->size, fb->align, ram_flags,
fb->mem_path, fb->readonly, errp);
diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
index 6feaa40..8173e3b 100644
--- a/include/exec/cpu-common.h
+++ b/include/exec/cpu-common.h
@@ -92,6 +92,7 @@ void qemu_ram_set_uf_zeroable(RAMBlock *rb);
bool qemu_ram_is_migratable(RAMBlock *rb);
void qemu_ram_set_migratable(RAMBlock *rb);
void qemu_ram_unset_migratable(RAMBlock *rb);
+bool qemu_ram_is_named_file(RAMBlock *rb);
int qemu_ram_get_fd(RAMBlock *rb);
size_t qemu_ram_pagesize(RAMBlock *block);
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 2e602a2..3224e09 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -232,6 +232,9 @@ typedef struct IOMMUTLBEvent {
/* RAM that isn't accessible through normal means. */
#define RAM_PROTECTED (1 << 8)
+/* RAM is an mmap-ed named file */
+#define RAM_NAMED_FILE (1 << 9)
+
static inline void iommu_notifier_init(IOMMUNotifier *n, IOMMUNotify fn,
IOMMUNotifierFlag flags,
hwaddr start, hwaddr end,
diff --git a/migration/ram.c b/migration/ram.c
index 334309f..03f8131 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -190,7 +190,8 @@ static bool postcopy_preempt_active(void)
bool ramblock_is_ignored(RAMBlock *block)
{
return !qemu_ram_is_migratable(block) ||
- (migrate_ignore_shared() && qemu_ram_is_shared(block));
+ (migrate_ignore_shared() && qemu_ram_is_shared(block) &&
+ qemu_ram_is_named_file(block));
}
#undef RAMBLOCK_FOREACH
diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index cb998cd..767a5f0 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -1749,6 +1749,11 @@ void qemu_ram_unset_migratable(RAMBlock *rb)
rb->flags &= ~RAM_MIGRATABLE;
}
+bool qemu_ram_is_named_file(RAMBlock *rb)
+{
+ return rb->flags & RAM_NAMED_FILE;
+}
+
int qemu_ram_get_fd(RAMBlock *rb)
{
return rb->fd;
@@ -2059,7 +2064,7 @@ RAMBlock *qemu_ram_alloc_from_fd(ram_addr_t size, MemoryRegion *mr,
/* Just support these ram flags by now. */
assert((ram_flags & ~(RAM_SHARED | RAM_PMEM | RAM_NORESERVE |
- RAM_PROTECTED)) == 0);
+ RAM_PROTECTED | RAM_NAMED_FILE)) == 0);
if (xen_enabled()) {
error_setg(errp, "-mem-path not supported with Xen");
--
1.8.3.1
On Tue, Feb 07, 2023 at 11:03:33AM -0800, Steve Sistare wrote: > migrate_ignore_shared() is an optimization that avoids copying memory > that is visible and can be mapped on the target. However, a > memory-backend-ram or a memory-backend-memfd block with the RAM_SHARED > flag set is not migrated when migrate_ignore_shared() is true. This is > wrong, because the block has no named backing store, and its contents will > be lost. To fix, ignore shared memory iff it is a named file. Define a > new flag RAM_NAMED_FILE to distinguish this case. There's also TYPE_MEMORY_BACKEND_EPC. Reading the commit message it seems it can still be used in similar ways. Pasting commit message from c6c0232: Because of its unique requirements, Linux manages EPC separately from normal memory. Similar to memfd, the device /dev/sgx_vepc can be opened to obtain a file descriptor which can in turn be used to mmap() EPC memory. I'm not sure whether it means that should apply for RAM_NAMED_FILE too, neither do I think it's super important.. Still better to define it properly. Another comment is, AFAIK this patch will modify senamtics of the old capability "x-ignore-shared". But I'd say in a sensible way. Maybe worth directly modify qapi/migration.json to reflect it (especially it's x- prefixed) to something like: # @x-ignore-shared: If enabled, QEMU will not migrate named shared memory # (since 4.0) Thanks, -- Peter Xu
On 2/7/2023 3:23 PM, Peter Xu wrote: > On Tue, Feb 07, 2023 at 11:03:33AM -0800, Steve Sistare wrote: >> migrate_ignore_shared() is an optimization that avoids copying memory >> that is visible and can be mapped on the target. However, a >> memory-backend-ram or a memory-backend-memfd block with the RAM_SHARED >> flag set is not migrated when migrate_ignore_shared() is true. This is >> wrong, because the block has no named backing store, and its contents will >> be lost. To fix, ignore shared memory iff it is a named file. Define a >> new flag RAM_NAMED_FILE to distinguish this case. > > There's also TYPE_MEMORY_BACKEND_EPC. Reading the commit message it seems > it can still be used in similar ways. Pasting commit message from c6c0232: > > Because of its unique requirements, Linux manages EPC separately from > normal memory. Similar to memfd, the device /dev/sgx_vepc can be > opened to obtain a file descriptor which can in turn be used to mmap() > EPC memory. > > I'm not sure whether it means that should apply for RAM_NAMED_FILE too, > neither do I think it's super important.. Still better to define it > properly. The RAM_NAMED_FILE flag will be false for TYPE_MEMORY_BACKEND_EPC, so ramblock_is_ignored will return false, and the contents will be copied over the socket to the target, and the segment will be recreated there. However, perhaps I do not understand your point. > Another comment is, AFAIK this patch will modify senamtics of the old > capability "x-ignore-shared". But I'd say in a sensible way. Maybe worth > directly modify qapi/migration.json to reflect it (especially it's x- > prefixed) to something like: > > # @x-ignore-shared: If enabled, QEMU will not migrate named shared memory > # (since 4.0) Good idea. I propose: # @x-ignore-shared: If enabled, QEMU will not migrate shared memory that is # accessible on the target. (since 4.0) - Steve
On Wed, Feb 08, 2023 at 01:34:18PM -0500, Steven Sistare wrote: > On 2/7/2023 3:23 PM, Peter Xu wrote: > > On Tue, Feb 07, 2023 at 11:03:33AM -0800, Steve Sistare wrote: > >> migrate_ignore_shared() is an optimization that avoids copying memory > >> that is visible and can be mapped on the target. However, a > >> memory-backend-ram or a memory-backend-memfd block with the RAM_SHARED > >> flag set is not migrated when migrate_ignore_shared() is true. This is > >> wrong, because the block has no named backing store, and its contents will > >> be lost. To fix, ignore shared memory iff it is a named file. Define a > >> new flag RAM_NAMED_FILE to distinguish this case. > > > > There's also TYPE_MEMORY_BACKEND_EPC. Reading the commit message it seems > > it can still be used in similar ways. Pasting commit message from c6c0232: > > > > Because of its unique requirements, Linux manages EPC separately from > > normal memory. Similar to memfd, the device /dev/sgx_vepc can be > > opened to obtain a file descriptor which can in turn be used to mmap() > > EPC memory. > > > > I'm not sure whether it means that should apply for RAM_NAMED_FILE too, > > neither do I think it's super important.. Still better to define it > > properly. > > The RAM_NAMED_FILE flag will be false for TYPE_MEMORY_BACKEND_EPC, so > ramblock_is_ignored will return false, and the contents will be copied > over the socket to the target, and the segment will be recreated there. > > However, perhaps I do not understand your point. My point was it looked like it should apply RAM_NAMED_FILE too, because it's also a named file. But.. I don't think another QEMU can share the same data if opening the same file. Based on my zero knowledge on it... I quickly looked up sgx_vepc_open() in the Linux impl where sgx_vepc.page_array plays a role of page cache iiuc, while it's private per vma even if VM_SHARED. So please ignore my comment.. > > > Another comment is, AFAIK this patch will modify senamtics of the old > > capability "x-ignore-shared". But I'd say in a sensible way. Maybe worth > > directly modify qapi/migration.json to reflect it (especially it's x- > > prefixed) to something like: > > > > # @x-ignore-shared: If enabled, QEMU will not migrate named shared memory > > # (since 4.0) > > Good idea. I propose: > > # @x-ignore-shared: If enabled, QEMU will not migrate shared memory that is > # accessible on the target. (since 4.0) I would use s/target/destination host/ because target can be misread as the VM itself (at least a general term in migration code base). Other than that looks good to me. Thanks, -- Peter Xu
On 2/8/2023 2:22 PM, Peter Xu wrote: > On Wed, Feb 08, 2023 at 01:34:18PM -0500, Steven Sistare wrote: >> On 2/7/2023 3:23 PM, Peter Xu wrote: >>> On Tue, Feb 07, 2023 at 11:03:33AM -0800, Steve Sistare wrote: >>>> migrate_ignore_shared() is an optimization that avoids copying memory >>>> that is visible and can be mapped on the target. However, a >>>> memory-backend-ram or a memory-backend-memfd block with the RAM_SHARED >>>> flag set is not migrated when migrate_ignore_shared() is true. This is >>>> wrong, because the block has no named backing store, and its contents will >>>> be lost. To fix, ignore shared memory iff it is a named file. Define a >>>> new flag RAM_NAMED_FILE to distinguish this case. >>> >>> There's also TYPE_MEMORY_BACKEND_EPC. Reading the commit message it seems >>> it can still be used in similar ways. Pasting commit message from c6c0232: >>> >>> Because of its unique requirements, Linux manages EPC separately from >>> normal memory. Similar to memfd, the device /dev/sgx_vepc can be >>> opened to obtain a file descriptor which can in turn be used to mmap() >>> EPC memory. >>> >>> I'm not sure whether it means that should apply for RAM_NAMED_FILE too, >>> neither do I think it's super important.. Still better to define it >>> properly. >> >> The RAM_NAMED_FILE flag will be false for TYPE_MEMORY_BACKEND_EPC, so >> ramblock_is_ignored will return false, and the contents will be copied >> over the socket to the target, and the segment will be recreated there. >> >> However, perhaps I do not understand your point. > > My point was it looked like it should apply RAM_NAMED_FILE too, because > it's also a named file. > > But.. I don't think another QEMU can share the same data if opening the > same file. Based on my zero knowledge on it... I quickly looked up > sgx_vepc_open() in the Linux impl where sgx_vepc.page_array plays a role of > page cache iiuc, while it's private per vma even if VM_SHARED. > > So please ignore my comment.. > >> >>> Another comment is, AFAIK this patch will modify senamtics of the old >>> capability "x-ignore-shared". But I'd say in a sensible way. Maybe worth >>> directly modify qapi/migration.json to reflect it (especially it's x- >>> prefixed) to something like: >>> >>> # @x-ignore-shared: If enabled, QEMU will not migrate named shared memory >>> # (since 4.0) >> >> Good idea. I propose: >> >> # @x-ignore-shared: If enabled, QEMU will not migrate shared memory that is >> # accessible on the target. (since 4.0) > > I would use s/target/destination host/ because target can be misread as the > VM itself (at least a general term in migration code base). Other than > that looks good to me. Hi Peter, I will rebase to the tip, modify the wording, and repost. Can I add your RB? - Steve
On Wed, Jun 07, 2023 at 10:05:11AM -0400, Steven Sistare wrote: > On 2/8/2023 2:22 PM, Peter Xu wrote: > > On Wed, Feb 08, 2023 at 01:34:18PM -0500, Steven Sistare wrote: > >> On 2/7/2023 3:23 PM, Peter Xu wrote: > >>> On Tue, Feb 07, 2023 at 11:03:33AM -0800, Steve Sistare wrote: > >>>> migrate_ignore_shared() is an optimization that avoids copying memory > >>>> that is visible and can be mapped on the target. However, a > >>>> memory-backend-ram or a memory-backend-memfd block with the RAM_SHARED > >>>> flag set is not migrated when migrate_ignore_shared() is true. This is > >>>> wrong, because the block has no named backing store, and its contents will > >>>> be lost. To fix, ignore shared memory iff it is a named file. Define a > >>>> new flag RAM_NAMED_FILE to distinguish this case. > >>> > >>> There's also TYPE_MEMORY_BACKEND_EPC. Reading the commit message it seems > >>> it can still be used in similar ways. Pasting commit message from c6c0232: > >>> > >>> Because of its unique requirements, Linux manages EPC separately from > >>> normal memory. Similar to memfd, the device /dev/sgx_vepc can be > >>> opened to obtain a file descriptor which can in turn be used to mmap() > >>> EPC memory. > >>> > >>> I'm not sure whether it means that should apply for RAM_NAMED_FILE too, > >>> neither do I think it's super important.. Still better to define it > >>> properly. > >> > >> The RAM_NAMED_FILE flag will be false for TYPE_MEMORY_BACKEND_EPC, so > >> ramblock_is_ignored will return false, and the contents will be copied > >> over the socket to the target, and the segment will be recreated there. > >> > >> However, perhaps I do not understand your point. > > > > My point was it looked like it should apply RAM_NAMED_FILE too, because > > it's also a named file. > > > > But.. I don't think another QEMU can share the same data if opening the > > same file. Based on my zero knowledge on it... I quickly looked up > > sgx_vepc_open() in the Linux impl where sgx_vepc.page_array plays a role of > > page cache iiuc, while it's private per vma even if VM_SHARED. > > > > So please ignore my comment.. > > > >> > >>> Another comment is, AFAIK this patch will modify senamtics of the old > >>> capability "x-ignore-shared". But I'd say in a sensible way. Maybe worth > >>> directly modify qapi/migration.json to reflect it (especially it's x- > >>> prefixed) to something like: > >>> > >>> # @x-ignore-shared: If enabled, QEMU will not migrate named shared memory > >>> # (since 4.0) > >> > >> Good idea. I propose: > >> > >> # @x-ignore-shared: If enabled, QEMU will not migrate shared memory that is > >> # accessible on the target. (since 4.0) > > > > I would use s/target/destination host/ because target can be misread as the > > VM itself (at least a general term in migration code base). Other than > > that looks good to me. > > Hi Peter, I will rebase to the tip, modify the wording, and repost. > Can I add your RB? Yes. thanks, -- Peter Xu
© 2016 - 2024 Red Hat, Inc.