From nobody Sun May 5 02:51:40 2024
From: Wei Yang
To: qemu-devel@nongnu.org
Date: Mon, 22 Apr 2019 08:48:48 +0800
Message-Id: <20190422004849.26463-2-richardw.yang@linux.intel.com>
X-Mailer: git-send-email 2.19.1
In-Reply-To: <20190422004849.26463-1-richardw.yang@linux.intel.com>
References: <20190422004849.26463-1-richardw.yang@linux.intel.com>
Subject: [Qemu-devel] [PATCH v14 1/2] util/mmap-alloc: support MAP_SYNC in qemu_ram_mmap()
Cc: pagupta@redhat.com, xiaoguangrong.eric@gmail.com, mst@redhat.com,
 Haozhong Zhang, yi.z.zhang@linux.intel.com, yu.c.zhang@linux.intel.com,
 richardw.yang@linux.intel.com, stefanha@redhat.com, imammedo@redhat.com,
 pbonzini@redhat.com, dan.j.williams@intel.com, ehabkost@redhat.com
Content-Type: text/plain; charset="utf-8"

From: Zhang Yi

When a file that supports DAX is used as a vNVDIMM backend, also mmap it
with the MAP_SYNC flag. This ensures that file system metadata is synced
on each guest write to the backend file, without other QEMU actions
(e.g., periodic fsync() by QEMU).

Currently, there are the following possible use cases:

1. pmem=on is set, share=on is set, and MAP_SYNC is supported:
   a: the backend is a file that supports DAX.
    - MAP_SYNC will be active.
   b: the backend is not a file that supports DAX.
    - mmap with MAP_SYNC fails, QEMU prints a warning, and the mapping
      is retried without MAP_SYNC.

2. All other cases:
   - MAP_SYNC is never passed to mmap2.

Signed-off-by: Haozhong Zhang
Signed-off-by: Zhang Yi
[ehabkost: Rebased patch to latest code on master]
Signed-off-by: Eduardo Habkost
Signed-off-by: Wei Yang
Tested-by: Wei Yang
Reviewed-by: Michael S. Tsirkin
Reviewed-by: Stefan Hajnoczi
---
v14: rebase on top of current upstream
---
 util/mmap-alloc.c | 41 ++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 40 insertions(+), 1 deletion(-)

diff --git a/util/mmap-alloc.c b/util/mmap-alloc.c
index 9713f4b960..f7f177d0ea 100644
--- a/util/mmap-alloc.c
+++ b/util/mmap-alloc.c
@@ -10,6 +10,13 @@
  * later. See the COPYING file in the top-level directory.
  */
 
+#ifdef CONFIG_LINUX
+#include <linux/mman.h>
+#else /* !CONFIG_LINUX */
+#define MAP_SYNC 0x0
+#define MAP_SHARED_VALIDATE 0x0
+#endif /* CONFIG_LINUX */
+
 #include "qemu/osdep.h"
 #include "qemu/mmap-alloc.h"
 #include "qemu/host-utils.h"
@@ -82,6 +89,7 @@ void *qemu_ram_mmap(int fd,
                     bool is_pmem)
 {
     int flags;
+    int map_sync_flags = 0;
     int guardfd;
     size_t offset;
     size_t pagesize;
@@ -132,9 +140,40 @@ void *qemu_ram_mmap(int fd,
     flags = MAP_FIXED;
     flags |= fd == -1 ? MAP_ANONYMOUS : 0;
     flags |= shared ? MAP_SHARED : MAP_PRIVATE;
+    if (shared && is_pmem) {
+        map_sync_flags = MAP_SYNC | MAP_SHARED_VALIDATE;
+    }
+
     offset = QEMU_ALIGN_UP((uintptr_t)guardptr, align) - (uintptr_t)guardptr;
 
-    ptr = mmap(guardptr + offset, size, PROT_READ | PROT_WRITE, flags, fd, 0);
+    ptr = mmap(guardptr + offset, size, PROT_READ | PROT_WRITE,
+               flags | map_sync_flags, fd, 0);
+
+    if (ptr == MAP_FAILED && map_sync_flags) {
+        if (errno == ENOTSUP) {
+            char *proc_link, *file_name;
+            int len;
+            proc_link = g_strdup_printf("/proc/self/fd/%d", fd);
+            file_name = g_malloc0(PATH_MAX);
+            len = readlink(proc_link, file_name, PATH_MAX - 1);
+            if (len < 0) {
+                len = 0;
+            }
+            file_name[len] = '\0';
+            fprintf(stderr, "Warning: requesting persistence across crashes "
+                    "for backend file %s failed. Proceeding without "
+                    "persistence, data might become corrupted in case of host "
+                    "crash.\n", file_name);
+            g_free(proc_link);
+            g_free(file_name);
+        }
+        /*
+         * if map failed with MAP_SHARED_VALIDATE | MAP_SYNC,
+         * we will remove these flags to handle compatibility.
+         */
+        ptr = mmap(guardptr + offset, size, PROT_READ | PROT_WRITE,
+                   flags, fd, 0);
+    }
 
     if (ptr == MAP_FAILED) {
         munmap(guardptr, total);
-- 
2.19.1

From nobody Sun May 5 02:51:40 2024
From: Wei Yang
To: qemu-devel@nongnu.org
Date: Mon, 22 Apr 2019 08:48:49 +0800
Message-Id: <20190422004849.26463-3-richardw.yang@linux.intel.com>
X-Mailer: git-send-email 2.19.1
In-Reply-To: <20190422004849.26463-1-richardw.yang@linux.intel.com>
References: <20190422004849.26463-1-richardw.yang@linux.intel.com>
Subject: [Qemu-devel] [PATCH v14 2/2] docs: Added MAP_SYNC documentation
Cc: pagupta@redhat.com, xiaoguangrong.eric@gmail.com, mst@redhat.com,
 yi.z.zhang@linux.intel.com, yu.c.zhang@linux.intel.com,
 richardw.yang@linux.intel.com, stefanha@redhat.com, imammedo@redhat.com,
 pbonzini@redhat.com, dan.j.williams@intel.com, ehabkost@redhat.com
Content-Type: text/plain; charset="utf-8"

From: Zhang Yi

Signed-off-by: Zhang Yi
Reviewed-by: Michael S. Tsirkin
Reviewed-by: Pankaj Gupta
Reviewed-by: Stefan Hajnoczi
---
 docs/nvdimm.txt | 22 +++++++++++++++++++---
 qemu-options.hx |  5 +++++
 2 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/docs/nvdimm.txt b/docs/nvdimm.txt
index 7231c2d78f..bcd1456e72 100644
--- a/docs/nvdimm.txt
+++ b/docs/nvdimm.txt
@@ -144,9 +144,25 @@ Guest Data Persistence
 ----------------------
 
 Though QEMU supports multiple types of vNVDIMM backends on Linux,
-currently the only one that can guarantee the guest write persistence
-is the device DAX on the real NVDIMM device (e.g., /dev/dax0.0), to
-which all guest access do not involve any host-side kernel cache.
+the only backends that can guarantee guest write persistence are:
+
+A. a DAX device (e.g., /dev/dax0.0), or
+B. a DAX file (a file on a filesystem mounted with the dax option).
+
+When using B (a file supporting direct mapping of persistent memory)
+as a backend, write persistence is guaranteed if the host kernel
+supports the MAP_SYNC flag in the mmap system call (available
+since Linux 4.15 and on certain distro kernels) and, additionally,
+both the 'pmem' and 'share' flags are set to 'on' on the backend.
+
+If these conditions are not satisfied, i.e. if either 'pmem' or 'share'
+is not set, if the backend file does not support DAX, or if MAP_SYNC
+is not supported by the host kernel, write persistence is not
+guaranteed after a system crash. For compatibility reasons, these
+conditions are silently ignored if not satisfied. Currently, no way
+is provided to test for them.
+For more details, please refer to the mmap(2) man page:
+http://man7.org/linux/man-pages/man2/mmap.2.html.
 
 When using other types of backends, it's suggested to set 'unarmed'
 option of '-device nvdimm' to 'on', which sets the unarmed flag of the
diff --git a/qemu-options.hx b/qemu-options.hx
index 08749a3391..bdc74c0620 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -4233,6 +4233,11 @@ using the SNIA NVM programming model (e.g. Intel NVDIMM).
 If @option{pmem} is set to 'on', QEMU will take necessary operations to
 guarantee the persistence of its own writes to @option{mem-path}
 (e.g. in vNVDIMM label emulation and live migration).
+Also, QEMU will map the backend file with the MAP_SYNC flag, which ensures
+that the file metadata is in sync for @option{mem-path} in case of a host
+crash or a power failure. MAP_SYNC requires support from both the host
+kernel (since Linux kernel 4.15) and the filesystem of @option{mem-path},
+which must be mounted with the DAX option.
 
 @item -object memory-backend-ram,id=@var{id},merge=@var{on|off},dump=@var{on|off},share=@var{on|off},prealloc=@var{on|off},size=@var{size},host-nodes=@var{host-nodes},policy=@var{default|preferred|bind|interleave}
 
-- 
2.19.1
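
The try-then-fall-back behaviour that patch 1/2 adds to qemu_ram_mmap() can be illustrated outside QEMU. The following is only a minimal sketch under stated assumptions, not the QEMU code: the helper name map_shared_try_sync() and its got_sync out-parameter are hypothetical, and the fallback #define values are the generic Linux ones, assumed here for C libraries that do not expose the flags yet.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Assumption: generic Linux values, used only if the C library headers
 * do not define the flags. On non-Linux hosts both become no-ops (0),
 * mirroring the #else branch of the patch.
 */
#ifdef __linux__
#ifndef MAP_SHARED_VALIDATE
#define MAP_SHARED_VALIDATE 0x03
#endif
#ifndef MAP_SYNC
#define MAP_SYNC 0x080000
#endif
#else
#ifndef MAP_SHARED_VALIDATE
#define MAP_SHARED_VALIDATE 0x0
#endif
#ifndef MAP_SYNC
#define MAP_SYNC 0x0
#endif
#endif

/*
 * Hypothetical helper (not a QEMU function): map `size` bytes of `fd`
 * as a shared, writable mapping. If want_sync is true, first try with
 * MAP_SHARED_VALIDATE | MAP_SYNC and, on failure, retry without them.
 * *got_sync reports whether the returned mapping was created with MAP_SYNC.
 */
void *map_shared_try_sync(int fd, size_t size, bool want_sync, bool *got_sync)
{
    int flags = MAP_SHARED;
    int sync_flags = want_sync ? (MAP_SHARED_VALIDATE | MAP_SYNC) : 0;
    void *ptr;

    assert(size > 0);
    *got_sync = false;

    ptr = mmap(NULL, size, PROT_READ | PROT_WRITE, flags | sync_flags, fd, 0);
    if (ptr != MAP_FAILED) {
        *got_sync = sync_flags != 0;
        return ptr;
    }
    if (!sync_flags) {
        return MAP_FAILED;      /* nothing to fall back to */
    }
    if (errno == EOPNOTSUPP) {
        /* Kernel knows MAP_SYNC, but the file is not on a DAX filesystem. */
        fprintf(stderr, "warning: MAP_SYNC not supported for fd %d, "
                "continuing without persistence guarantee\n", fd);
    }
    /* Retry with the plain flags, as the patch does for compatibility. */
    return mmap(NULL, size, PROT_READ | PROT_WRITE, flags, fd, 0);
}
```

A caller would pass want_sync only when both pmem and share are enabled, and could inspect got_sync to decide whether to warn the user or to mark the device as unarmed. Note that the sketch, like the patch, retries silently on errors other than EOPNOTSUPP, which is what happens on kernels that predate MAP_SHARED_VALIDATE.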