From: Haozhong Zhang
To: qemu-devel@nongnu.org
Cc: Haozhong Zhang, Xiao Guangrong, mst@redhat.com, Stefan Hajnoczi,
 Paolo Bonzini, Igor Mammedov, Dan Williams, Eduardo Habkost
Date: Thu, 11 Jan 2018 22:22:07 +0800
Message-Id: <20180111142208.17617-2-haozhong.zhang@intel.com>
In-Reply-To: <20180111142208.17617-1-haozhong.zhang@intel.com>
References: <20180111142208.17617-1-haozhong.zhang@intel.com>
Subject: [Qemu-devel] [PATCH v2 1/2] util/mmap-alloc: support MAP_SYNC in qemu_ram_mmap()

When a file supporting DAX is used as a vNVDIMM backend, additionally
mapping it with the MAP_SYNC flag guarantees the persistence of guest
writes to the backend file without other QEMU actions (e.g., periodic
fsync() by QEMU).

An OnOffAuto parameter 'sync' is added to qemu_ram_mmap():

- If sync == ON_OFF_AUTO_ON, qemu_ram_mmap() will try to pass MAP_SYNC
  to mmap(). It will then fail if the host OS or the backend file does
  not support MAP_SYNC, or if MAP_SYNC conflicts with other flags.

- If sync == ON_OFF_AUTO_OFF, qemu_ram_mmap() will never pass MAP_SYNC
  to mmap().

- If sync == ON_OFF_AUTO_AUTO, and
  * if the host OS and the backend file support MAP_SYNC, and MAP_SYNC
    does not conflict with other flags, qemu_ram_mmap() will work as
    if sync == ON_OFF_AUTO_ON.
  * otherwise, qemu_ram_mmap() will work as if sync == ON_OFF_AUTO_OFF.
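The three cases above can be sketched as a small standalone function
that turns the requested mode into extra mmap() flags. This is only an
illustration under stated assumptions, not the QEMU code: SyncMode,
sync_mode_to_mmap_flags() and the host_has_map_sync parameter are
hypothetical names, and the fallback flag values are the ones used by
this patch for older headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>

/* Fallback values for headers older than Linux 4.15 (same as the patch). */
#ifndef MAP_SHARED_VALIDATE
#define MAP_SHARED_VALIDATE 0x3
#endif
#ifndef MAP_SYNC
#define MAP_SYNC 0x80000
#endif

/* Hypothetical stand-in for QEMU's OnOffAuto type. */
typedef enum { SYNC_AUTO, SYNC_ON, SYNC_OFF } SyncMode;

/*
 * Compute the extra mmap() flags for a requested sync mode.
 * Returns false when the request cannot be honoured, i.e. sync=on
 * without host support or without a shared mapping; *xflags is 0 then.
 */
static bool sync_mode_to_mmap_flags(SyncMode mode, bool shared,
                                    bool host_has_map_sync, int *xflags)
{
    assert(xflags != NULL);
    *xflags = 0;

    if (!host_has_map_sync || !shared) {
        /* on: hard failure; auto and off: no extra flags */
        return mode != SYNC_ON;
    }
    if (mode == SYNC_OFF) {
        return true;
    }

    *xflags = MAP_SYNC;
    if (mode == SYNC_ON) {
        /* Ask the kernel to reject flags it does not support. */
        *xflags |= MAP_SHARED_VALIDATE;
    }
    return true;
}
```

A caller would OR the returned flags into the flags of a MAP_SHARED
mapping. In this sketch a failed request is reported before any
address space is reserved, so there is nothing to unmap on that path.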
Signed-off-by: Haozhong Zhang
---
 exec.c                    |  2 +-
 include/qemu/mmap-alloc.h |  3 ++-
 include/qemu/osdep.h      | 16 ++++++++++++++++
 util/mmap-alloc.c         | 24 ++++++++++++++++++++++--
 util/oslib-posix.c        |  2 +-
 5 files changed, 42 insertions(+), 5 deletions(-)

diff --git a/exec.c b/exec.c
index 8fba88ae1c..f4254cb6d3 100644
--- a/exec.c
+++ b/exec.c
@@ -1646,7 +1646,7 @@ static void *file_ram_alloc(RAMBlock *block,
     }
 
     area = qemu_ram_mmap(fd, memory, block->mr->align,
-                         block->flags & RAM_SHARED);
+                         block->flags & RAM_SHARED, ON_OFF_AUTO_OFF);
     if (area == MAP_FAILED) {
         error_setg_errno(errp, errno,
                          "unable to map backing store for guest RAM");
diff --git a/include/qemu/mmap-alloc.h b/include/qemu/mmap-alloc.h
index 50385e3f81..dd5876471f 100644
--- a/include/qemu/mmap-alloc.h
+++ b/include/qemu/mmap-alloc.h
@@ -7,7 +7,8 @@ size_t qemu_fd_getpagesize(int fd);
 
 size_t qemu_mempath_getpagesize(const char *mem_path);
 
-void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared);
+void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared,
+                    OnOffAuto sync);
 
 void qemu_ram_munmap(void *ptr, size_t size);
 
diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
index adb3758275..55637e0724 100644
--- a/include/qemu/osdep.h
+++ b/include/qemu/osdep.h
@@ -372,6 +372,22 @@ void qemu_anon_ram_free(void *ptr, size_t size);
 # define QEMU_VMALLOC_ALIGN getpagesize()
 #endif
 
+/*
+ * MAP_SHARED_VALIDATE and MAP_SYNC were introduced in Linux kernel
+ * 4.15, so they may not be defined when compiling on older kernels.
+ */
+#ifdef CONFIG_LINUX
+#ifndef MAP_SHARED_VALIDATE
+#define MAP_SHARED_VALIDATE 0x3
+#endif
+#ifndef MAP_SYNC
+#define MAP_SYNC 0x80000
+#endif
+#define QEMU_HAS_MAP_SYNC true
+#else /* !CONFIG_LINUX */
+#define QEMU_HAS_MAP_SYNC false
+#endif /* CONFIG_LINUX */
+
 #ifdef CONFIG_POSIX
 struct qemu_signalfd_siginfo {
     uint32_t ssi_signo; /* Signal number */
diff --git a/util/mmap-alloc.c b/util/mmap-alloc.c
index 2fd8cbcc6f..af57218669 100644
--- a/util/mmap-alloc.c
+++ b/util/mmap-alloc.c
@@ -73,7 +73,8 @@ size_t qemu_mempath_getpagesize(const char *mem_path)
     return getpagesize();
 }
 
-void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared)
+void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared,
+                    OnOffAuto sync)
 {
     /*
      * Note: this always allocates at least one extra page of virtual address
@@ -97,6 +98,7 @@ void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared)
 #endif
     size_t offset;
     void *ptr1;
+    int xflags = 0;
 
     if (ptr == MAP_FAILED) {
         return MAP_FAILED;
@@ -106,11 +108,29 @@ void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared)
     /* Always align to host page size */
     assert(align >= getpagesize());
 
+    if (!QEMU_HAS_MAP_SYNC || !shared) {
+        if (sync == ON_OFF_AUTO_ON) {
+            return MAP_FAILED;
+        }
+        sync = ON_OFF_AUTO_OFF;
+    }
+    if (sync != ON_OFF_AUTO_OFF) {
+        xflags = MAP_SYNC;
+    }
+    /*
+     * If MAP_SHARED_VALIDATE is present, mmap will fail when MAP_SYNC
+     * is not supported. Otherwise, mmap will just ignore MAP_SYNC when
+     * it's not supported.
+     */
+    if (sync == ON_OFF_AUTO_ON) {
+        xflags |= MAP_SHARED_VALIDATE;
+    }
+
     offset = QEMU_ALIGN_UP((uintptr_t)ptr, align) - (uintptr_t)ptr;
     ptr1 = mmap(ptr + offset, size, PROT_READ | PROT_WRITE,
                 MAP_FIXED |
                 (fd == -1 ? MAP_ANONYMOUS : 0) |
-                (shared ? MAP_SHARED : MAP_PRIVATE),
+                (shared ? MAP_SHARED : MAP_PRIVATE) | xflags,
                 fd, 0);
     if (ptr1 == MAP_FAILED) {
         munmap(ptr, total);
diff --git a/util/oslib-posix.c b/util/oslib-posix.c
index 77369c92ce..ecb1c275d2 100644
--- a/util/oslib-posix.c
+++ b/util/oslib-posix.c
@@ -130,7 +130,7 @@ void *qemu_memalign(size_t alignment, size_t size)
 void *qemu_anon_ram_alloc(size_t size, uint64_t *alignment)
 {
     size_t align = QEMU_VMALLOC_ALIGN;
-    void *ptr = qemu_ram_mmap(-1, size, align, false);
+    void *ptr = qemu_ram_mmap(-1, size, align, false, ON_OFF_AUTO_OFF);
 
     if (ptr == MAP_FAILED) {
         return NULL;
-- 
2.15.1

From: Haozhong Zhang
To: qemu-devel@nongnu.org
Cc: Haozhong Zhang, Xiao Guangrong, mst@redhat.com, Stefan Hajnoczi,
 Paolo Bonzini, Igor Mammedov, Dan Williams, Eduardo Habkost
Date: Thu, 11 Jan 2018 22:22:08 +0800
Message-Id: <20180111142208.17617-3-haozhong.zhang@intel.com>
In-Reply-To: <20180111142208.17617-1-haozhong.zhang@intel.com>
References: <20180111142208.17617-1-haozhong.zhang@intel.com>
Subject: [Qemu-devel] [PATCH v2 2/2] hostmem-file: add 'sync' option

This option controls whether QEMU maps the memory backend file with
the mmap(2) MAP_SYNC flag, which fully guarantees the persistence of
guest writes to the backend, provided that MAP_SYNC is supported by
the host kernel (Linux kernel 4.15 and later) and the backend is a
file supporting DAX (e.g., a file on an ext4/xfs file system mounted
with '-o dax').

It can take one of the following values:
 - on: try to pass MAP_SYNC to mmap(2); if MAP_SYNC is not supported or
       'share=off', QEMU will abort
 - off: never pass MAP_SYNC to mmap(2)
 - auto (default): if MAP_SYNC is supported and 'share=on', work as if
       'sync=on'; otherwise, work as if 'sync=off'

Signed-off-by: Haozhong Zhang
Suggested-by: Eduardo Habkost
---
 backends/hostmem-file.c | 39 ++++++++++++++++++++++++++++++++++++++-
 docs/nvdimm.txt         | 15 ++++++++++++++-
 exec.c                  | 13 ++++++++-----
 include/exec/memory.h   |  4 ++++
 include/exec/ram_addr.h |  6 +++---
 memory.c                |  6 ++++--
 numa.c                  |  2 +-
 qemu-options.hx         | 21 ++++++++++++++++++++-
 8 files changed, 92 insertions(+), 14 deletions(-)

diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
index e319ec1ad8..52fdc0a48a 100644
--- a/backends/hostmem-file.c
+++ b/backends/hostmem-file.c
@@ -15,6 +15,7 @@
 #include "sysemu/hostmem.h"
 #include "sysemu/sysemu.h"
 #include "qom/object_interfaces.h"
+#include "qapi-visit.h"
 
 /* hostmem-file.c */
 /**
@@ -35,6 +36,7 @@ struct HostMemoryBackendFile {
     bool discard_data;
     char *mem_path;
     uint64_t align;
+    OnOffAuto sync;
 };
 
 static void
@@ -60,7 +62,7 @@ file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
         memory_region_init_ram_from_file(&backend->mr, OBJECT(backend),
                                  path,
                                  backend->size, fb->align, fb->share,
-                                 fb->mem_path, errp);
+                                 fb->sync, fb->mem_path, errp);
         g_free(path);
     }
 #endif
@@ -150,6 +152,38 @@ static void file_memory_backend_set_align(Object *o, Visitor *v,
     error_propagate(errp, local_err);
 }
 
+static void file_memory_backend_get_sync(
+    Object *obj, Visitor *v, const char *name, void *opaque, Error **errp)
+{
+    HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(obj);
+    OnOffAuto value = fb->sync;
+
+    visit_type_OnOffAuto(v, name, &value, errp);
+}
+
+static void file_memory_backend_set_sync(
+    Object *obj, Visitor *v, const char *name, void *opaque, Error **errp)
+{
+    HostMemoryBackend *backend = MEMORY_BACKEND(obj);
+    HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(obj);
+    Error *local_err = NULL;
+    OnOffAuto value;
+
+    if (host_memory_backend_mr_inited(backend)) {
+        error_setg(&local_err, "cannot change property value");
+        goto out;
+    }
+
+    visit_type_OnOffAuto(v, name, &value, &local_err);
+    if (local_err) {
+        goto out;
+    }
+    fb->sync = value;
+
+ out:
+    error_propagate(errp, local_err);
+}
+
 static void file_backend_unparent(Object *obj)
 {
     HostMemoryBackend *backend = MEMORY_BACKEND(obj);
@@ -184,6 +218,9 @@ file_backend_class_init(ObjectClass *oc, void *data)
         file_memory_backend_get_align,
         file_memory_backend_set_align,
         NULL, NULL, &error_abort);
+    object_class_property_add(oc, "sync", "OnOffAuto",
+        file_memory_backend_get_sync, file_memory_backend_set_sync,
+        NULL, NULL, &error_abort);
 }
 
 static void file_backend_instance_finalize(Object *o)
diff --git a/docs/nvdimm.txt b/docs/nvdimm.txt
index e903d8bb09..49b174fe66 100644
--- a/docs/nvdimm.txt
+++ b/docs/nvdimm.txt
@@ -143,10 +143,23 @@ Guest Data Persistence
 ----------------------
 
 Though QEMU supports multiple types of vNVDIMM backends on Linux,
-currently the only one that can guarantee the guest write persistence
+if MAP_SYNC is not supported by the host kernel or the backend,
+the only backend that can guarantee the guest write persistence
 is the device DAX on the real NVDIMM device (e.g., /dev/dax0.0), to
 which all guest access do not involve any host-side kernel cache.
 
+The mmap(2) flag MAP_SYNC was added in Linux kernel 4.15. On such
+systems, QEMU can mmap(2) the backend with MAP_SYNC, which
+guarantees the persistence of guest writes to the vNVDIMM. Besides
+host kernel support, enabling MAP_SYNC in QEMU also requires that:
+
+ - the backend is a file supporting DAX, e.g., a file on an ext4 or
+   xfs file system mounted with "-o dax",
+
+ - the 'sync' option of memory-backend-file is not 'off', and
+
+ - the 'share' option of memory-backend-file is 'on'.
+
 When using other types of backends, it's suggested to set 'unarmed'
 option of '-device nvdimm' to 'on', which sets the unarmed flag of the
 guest NVDIMM region mapping structure. This unarmed flag indicates
diff --git a/exec.c b/exec.c
index f4254cb6d3..ce13f8cb21 100644
--- a/exec.c
+++ b/exec.c
@@ -1600,6 +1600,7 @@ static void *file_ram_alloc(RAMBlock *block,
                             ram_addr_t memory,
                             int fd,
                             bool truncate,
+                            OnOffAuto sync,
                             Error **errp)
 {
     void *area;
@@ -1646,7 +1647,7 @@ static void *file_ram_alloc(RAMBlock *block,
     }
 
     area = qemu_ram_mmap(fd, memory, block->mr->align,
-                         block->flags & RAM_SHARED, ON_OFF_AUTO_OFF);
+                         block->flags & RAM_SHARED, sync);
     if (area == MAP_FAILED) {
         error_setg_errno(errp, errno,
                          "unable to map backing store for guest RAM");
@@ -1974,7 +1975,7 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
 
 #ifdef __linux__
 RAMBlock *qemu_ram_alloc_from_fd(ram_addr_t size, MemoryRegion *mr,
-                                 bool share, int fd,
+                                 bool share, OnOffAuto sync, int fd,
                                  Error **errp)
 {
     RAMBlock *new_block;
@@ -2017,7 +2018,8 @@ RAMBlock *qemu_ram_alloc_from_fd(ram_addr_t size, MemoryRegion *mr,
     new_block->used_length = size;
     new_block->max_length = size;
     new_block->flags = share ? RAM_SHARED : 0;
-    new_block->host = file_ram_alloc(new_block, size, fd, !file_size, errp);
+    new_block->host = file_ram_alloc(new_block, size, fd, !file_size, sync,
+                                     errp);
     if (!new_block->host) {
         g_free(new_block);
         return NULL;
@@ -2035,7 +2037,8 @@ RAMBlock *qemu_ram_alloc_from_fd(ram_addr_t size, MemoryRegion *mr,
 
 
 RAMBlock *qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
-                                   bool share, const char *mem_path,
+                                   bool share, OnOffAuto sync,
+                                   const char *mem_path,
                                    Error **errp)
 {
     int fd;
@@ -2047,7 +2050,7 @@ RAMBlock *qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
         return NULL;
     }
 
-    block = qemu_ram_alloc_from_fd(size, mr, share, fd, errp);
+    block = qemu_ram_alloc_from_fd(size, mr, share, sync, fd, errp);
     if (!block) {
         if (created) {
             unlink(mem_path);
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 07c5d6d597..ff3cd583e9 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -468,6 +468,9 @@ void memory_region_init_resizeable_ram(MemoryRegion *mr,
  * @align: alignment of the region base address; if 0, the default alignment
  *         (getpagesize()) will be used.
  * @share: %true if memory must be mmaped with the MAP_SHARED flag
+ * @sync: %ON_OFF_AUTO_ON if memory must be mapped with MAP_SYNC flag;
+ *        %ON_OFF_AUTO_OFF if memory must not be mapped with MAP_SYNC flag;
+ *        %ON_OFF_AUTO_AUTO directs QEMU to mmap with MAP_SYNC flag if possible
  * @path: the path in which to allocate the RAM.
  * @errp: pointer to Error*, to store an error if it happens.
  *
@@ -480,6 +483,7 @@ void memory_region_init_ram_from_file(MemoryRegion *mr,
                                       uint64_t size,
                                       uint64_t align,
                                       bool share,
+                                      OnOffAuto sync,
                                       const char *path,
                                       Error **errp);
 
diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index 6cbc02aa0f..4494ae9132 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -73,10 +73,10 @@ static inline unsigned long int ramblock_recv_bitmap_offset(void *host_addr,
 long qemu_getrampagesize(void);
 unsigned long last_ram_page(void);
 RAMBlock *qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
-                                   bool share, const char *mem_path,
-                                   Error **errp);
+                                   bool share, OnOffAuto sync,
+                                   const char *mem_path, Error **errp);
 RAMBlock *qemu_ram_alloc_from_fd(ram_addr_t size, MemoryRegion *mr,
-                                 bool share, int fd,
+                                 bool share, OnOffAuto sync, int fd,
                                  Error **errp);
 RAMBlock *qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
                                   MemoryRegion *mr, Error **errp);
diff --git a/memory.c b/memory.c
index 449a1429b9..e22f51394e 100644
--- a/memory.c
+++ b/memory.c
@@ -1572,6 +1572,7 @@ void memory_region_init_ram_from_file(MemoryRegion *mr,
                                       uint64_t size,
                                       uint64_t align,
                                       bool share,
+                                      OnOffAuto sync,
                                       const char *path,
                                       Error **errp)
 {
@@ -1580,7 +1581,7 @@ void memory_region_init_ram_from_file(MemoryRegion *mr,
     mr->terminates = true;
     mr->destructor = memory_region_destructor_ram;
     mr->align = align;
-    mr->ram_block = qemu_ram_alloc_from_file(size, mr, share, path, errp);
+    mr->ram_block = qemu_ram_alloc_from_file(size, mr, share, sync, path, errp);
     mr->dirty_log_mask = tcg_enabled() ? (1 << DIRTY_MEMORY_CODE) : 0;
 }
 
@@ -1596,7 +1597,8 @@ void memory_region_init_ram_from_fd(MemoryRegion *mr,
     mr->ram = true;
     mr->terminates = true;
     mr->destructor = memory_region_destructor_ram;
-    mr->ram_block = qemu_ram_alloc_from_fd(size, mr, share, fd, errp);
+    mr->ram_block = qemu_ram_alloc_from_fd(size, mr, share, ON_OFF_AUTO_OFF, fd,
+                                           errp);
     mr->dirty_log_mask = tcg_enabled() ? (1 << DIRTY_MEMORY_CODE) : 0;
 }
 #endif
diff --git a/numa.c b/numa.c
index 83675a03f3..93180510a4 100644
--- a/numa.c
+++ b/numa.c
@@ -457,7 +457,7 @@ static void allocate_system_memory_nonnuma(MemoryRegion *mr, Object *owner,
 #ifdef __linux__
         Error *err = NULL;
         memory_region_init_ram_from_file(mr, owner, name, ram_size, 0, false,
-                                         mem_path, &err);
+                                         ON_OFF_AUTO_OFF, mem_path, &err);
         if (err) {
             error_report_err(err);
             if (mem_prealloc) {
diff --git a/qemu-options.hx b/qemu-options.hx
index 5ff741a4af..3ee423e6a8 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -3974,7 +3974,7 @@ property must be set. These objects are placed in the
 
 @table @option
 
-@item -object memory-backend-file,id=@var{id},size=@var{size},mem-path=@var{dir},share=@var{on|off},discard-data=@var{on|off},merge=@var{on|off},dump=@var{on|off},prealloc=@var{on|off},host-nodes=@var{host-nodes},policy=@var{default|preferred|bind|interleave},align=@var{align}
+@item -object memory-backend-file,id=@var{id},size=@var{size},mem-path=@var{dir},share=@var{on|off},discard-data=@var{on|off},merge=@var{on|off},dump=@var{on|off},prealloc=@var{on|off},host-nodes=@var{host-nodes},policy=@var{default|preferred|bind|interleave},align=@var{align},sync=@var{on|off|auto}
 
 Creates a memory file backend object, which can be used to back
 the guest RAM with huge pages.
@@ -4034,6 +4034,25 @@ requires an alignment different than the default one used by QEMU, eg
 the device DAX /dev/dax0.0 requires 2M alignment rather than 4K. In
 such cases, users can specify the required alignment via this option.
 
+The @option{sync} option specifies whether QEMU maps @option{mem-path}
+with the mmap(2) MAP_SYNC flag, which fully guarantees the persistence
+of guest writes to @option{mem-path}. MAP_SYNC requires support from both
+the host kernel (since Linux kernel 4.15) and @option{mem-path} (only
+files supporting DAX). It can take one of the following values:
+
+@table @option
+@item @var{on}
+try to pass MAP_SYNC to mmap(2); if MAP_SYNC is not supported or
+@option{share}=@var{off}, QEMU will abort
+
+@item @var{off}
+never pass MAP_SYNC to mmap(2)
+
+@item @var{auto} (default)
+if MAP_SYNC is supported and @option{share}=@var{on}, work as if
+@option{sync}=@var{on}; otherwise, work as if @option{sync}=@var{off}
+@end table
+
 @item -object memory-backend-ram,id=@var{id},merge=@var{on|off},dump=@var{on|off},prealloc=@var{on|off},size=@var{size},host-nodes=@var{host-nodes},policy=@var{default|preferred|bind|interleave}
 
 Creates a memory backend object, which can be used to back the guest RAM.
-- 
2.15.1
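
For readers who want to try the on/off/auto behaviour outside QEMU, the
following standalone sketch maps a file with MAP_SYNC and reports
whether the flag took effect. It is an illustration only and differs
from the patch: instead of relying on the kernel ignoring MAP_SYNC when
MAP_SHARED_VALIDATE is absent, it always validates first and falls back
to a plain MAP_SHARED mapping in 'auto' mode. SyncOpt, map_backend()
and open_scratch_fd() are hypothetical names, not QEMU APIs.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#ifdef __linux__
/* Fallback values for headers older than Linux 4.15. */
#ifndef MAP_SHARED_VALIDATE
#define MAP_SHARED_VALIDATE 0x3
#endif
#ifndef MAP_SYNC
#define MAP_SYNC 0x80000
#endif
#endif

/* Hypothetical stand-in for the on|off|auto option values. */
typedef enum { OPT_SYNC_AUTO, OPT_SYNC_ON, OPT_SYNC_OFF } SyncOpt;

/*
 * Map 'size' bytes of 'fd' as shared memory.
 * on:   require MAP_SYNC, return MAP_FAILED otherwise.
 * off:  never request MAP_SYNC.
 * auto: try MAP_SYNC first, fall back to a plain shared mapping.
 * *synced tells the caller whether MAP_SYNC is really in effect.
 */
static void *map_backend(int fd, size_t size, SyncOpt opt, bool *synced)
{
    assert(synced != NULL);
    *synced = false;

#ifdef __linux__
    if (opt != OPT_SYNC_OFF) {
        /* MAP_SHARED_VALIDATE makes the kernel reject unsupported flags. */
        void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                       MAP_SHARED_VALIDATE | MAP_SYNC, fd, 0);
        if (p != MAP_FAILED) {
            *synced = true;
            return p;
        }
    }
#endif
    if (opt == OPT_SYNC_ON) {
        return MAP_FAILED;
    }
    return mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
}

/* Hypothetical helper: an unlinked scratch file of the given size. */
static int open_scratch_fd(size_t size)
{
    FILE *f = tmpfile();
    int fd;

    if (f == NULL) {
        return -1;
    }
    fd = dup(fileno(f));
    fclose(f);
    if (fd >= 0 && ftruncate(fd, (off_t)size) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

On a file system without DAX, a call such as
map_backend(fd, size, OPT_SYNC_AUTO, &synced) is expected to succeed
through the fallback path with synced left false, while OPT_SYNC_ON is
expected to return MAP_FAILED.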