From nobody Thu Nov 6 18:58:36 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zoho.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1487735104777856.2212247208142; Tue, 21 Feb 2017 19:45:04 -0800 (PST) Received: from localhost ([::1]:49645 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cgNrT-0005uF-EN for importer@patchew.org; Tue, 21 Feb 2017 22:45:03 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:42600) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cgNpO-0004iC-NP for qemu-devel@nongnu.org; Tue, 21 Feb 2017 22:42:55 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1cgNpJ-0002uk-R7 for qemu-devel@nongnu.org; Tue, 21 Feb 2017 22:42:54 -0500 Received: from [45.249.212.187] (port=2538 helo=dggrg01-dlp.huawei.com) by eggs.gnu.org with esmtps (TLS1.0:RSA_ARCFOUR_SHA1:16) (Exim 4.71) (envelope-from ) id 1cgNpI-0002pk-Rb for qemu-devel@nongnu.org; Tue, 21 Feb 2017 22:42:49 -0500 Received: from 172.30.72.54 (EHLO DGGEMM401-HUB.china.huawei.com) ([172.30.72.54]) by dggrg01-dlp.huawei.com (MOS 4.4.6-GA FastPath queued) with ESMTP id AJN08763; Wed, 22 Feb 2017 11:42:40 +0800 (CST) Received: from DGGEML401-HUB.china.huawei.com (10.3.17.32) by DGGEMM401-HUB.china.huawei.com (10.3.20.209) with Microsoft SMTP Server (TLS) id 14.3.301.0; Wed, 22 Feb 2017 11:42:35 +0800 Received: from localhost (10.177.24.212) by DGGEML401-HUB.china.huawei.com (10.3.17.32) with Microsoft SMTP Server id 14.3.301.0; Wed, 22 Feb 2017 11:42:24 +0800 From: 
zhanghailiang To: , , Date: Wed, 22 Feb 2017 11:42:02 +0800 Message-ID: <1487734936-43472-2-git-send-email-zhang.zhanghailiang@huawei.com> X-Mailer: git-send-email 2.7.2.windows.1 In-Reply-To: <1487734936-43472-1-git-send-email-zhang.zhanghailiang@huawei.com> References: <1487734936-43472-1-git-send-email-zhang.zhanghailiang@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.177.24.212] X-CFilter-Loop: Reflected X-Mirapoint-Virus-RAPID-Raw: score=unknown(0), refid=str=0001.0A020201.58AD08B2.01F0, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0, ip=0.0.0.0, so=2014-11-16 11:51:01, dmn=2013-03-21 17:37:32 X-Mirapoint-Loop-Id: 17f7f23f22a070ba34cdbc983802d11d X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.4.x-2.6.x [generic] [fuzzy] X-Received-From: 45.249.212.187 Subject: [Qemu-devel] [PATCH 01/15] net/colo: Add notifier/callback related helpers for filter X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jason Wang , xiecl.fnst@cn.fujitsu.com, zhanghailiang , lizhijian@cn.fujitsu.com Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"

COLO will use this notifier to tell filter objects to take an action,
such as performing a checkpoint or handling a failover event.
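
For illustration only (this is not part of the patch): the notifier boils
down to an eventfd that one thread writes and another thread polls. The
sketch below is a minimal standalone version for Linux. DemoNotifier and
the demo_* functions are hypothetical names, and plain poll() stands in
for the GSource that the patch attaches to a GLib main context.

```c
#include <errno.h>
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical stand-in for FilterNotifier; not the QEMU structure. */
typedef struct DemoNotifier {
    int fd; /* eventfd used as the wake-up channel */
} DemoNotifier;

/* Create a non-blocking eventfd whose counter starts at 0. */
static int demo_notifier_init(DemoNotifier *n)
{
    n->fd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
    return n->fd < 0 ? -errno : 0;
}

/* Sender side: add `value` to the counter, retrying on EINTR. */
static int demo_notifier_set(DemoNotifier *n, uint64_t value)
{
    ssize_t ret;

    do {
        ret = write(n->fd, &value, sizeof(value));
    } while (ret < 0 && errno == EINTR);

    /* EAGAIN means the counter is full, so a wake-up is already pending. */
    if (ret < 0 && errno != EAGAIN) {
        return -errno;
    }
    return 0;
}

/*
 * Receiver side: wait up to timeout_ms for the fd to become readable,
 * then read the counter (which also resets it to 0).
 * Returns the accumulated value, or 0 if nothing was pending.
 */
static uint64_t demo_notifier_wait(DemoNotifier *n, int timeout_ms)
{
    struct pollfd pfd = { .fd = n->fd, .events = POLLIN };
    uint64_t value = 0;

    if (poll(&pfd, 1, timeout_ms) <= 0 || !(pfd.revents & POLLIN)) {
        return 0;
    }
    if (read(n->fd, &value, sizeof(value)) != (ssize_t)sizeof(value)) {
        return 0;
    }
    return value;
}
```

One property worth keeping in mind with this design: an eventfd read
returns the sum of all values written since the previous read, so two
back-to-back writes of 2 and 3 are seen by the reader as a single 5.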
Cc: Jason Wang
Signed-off-by: zhanghailiang
---
 net/colo.c | 92 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 net/colo.h | 18 ++++++++++++
 2 files changed, 110 insertions(+)

diff --git a/net/colo.c b/net/colo.c
index 8cc166b..1697150 100644
--- a/net/colo.c
+++ b/net/colo.c
@@ -15,6 +15,7 @@
 #include "qemu/osdep.h"
 #include "trace.h"
 #include "net/colo.h"
+#include "qapi/error.h"
 
 uint32_t connection_key_hash(const void *opaque)
 {
@@ -209,3 +210,94 @@ Connection *connection_get(GHashTable *connection_track_table,
 
     return conn;
 }
+
+static gboolean
+filter_notify_prepare(GSource *source, gint *timeout)
+{
+    *timeout = -1;
+
+    return FALSE;
+}
+
+static gboolean
+filter_notify_check(GSource *source)
+{
+    FilterNotifier *notify = (FilterNotifier *)source;
+
+    return notify->pfd.revents & (G_IO_IN | G_IO_HUP | G_IO_ERR);
+}
+
+static gboolean
+filter_notify_dispatch(GSource *source,
+                       GSourceFunc callback,
+                       gpointer user_data)
+{
+    FilterNotifier *notify = (FilterNotifier *)source;
+    int revents;
+    int ret;
+
+    revents = notify->pfd.revents & notify->pfd.events;
+    if (revents & (G_IO_IN | G_IO_HUP | G_IO_ERR)) {
+        ret = event_notifier_test_and_clear(&notify->event);
+        if (notify->cb) {
+            notify->cb(notify, ret);
+        }
+    }
+    return TRUE;
+}
+
+static void
+filter_notify_finalize(GSource *source)
+{
+    FilterNotifier *notify = (FilterNotifier *)source;
+
+    event_notifier_cleanup(&notify->event);
+}
+
+static GSourceFuncs notifier_source_funcs = {
+    filter_notify_prepare,
+    filter_notify_check,
+    filter_notify_dispatch,
+    filter_notify_finalize,
+};
+
+FilterNotifier *filter_noitifier_new(FilterNotifierCallback *cb,
+                                     void *opaque, Error **errp)
+{
+    FilterNotifier *notify;
+    int ret;
+
+    notify = (FilterNotifier *)g_source_new(&notifier_source_funcs,
+                                            sizeof(FilterNotifier));
+    ret = event_notifier_init(&notify->event, false);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "Failed to initialize event notifier");
+        goto fail;
+    }
+    notify->pfd.fd = event_notifier_get_fd(&notify->event);
+    notify->pfd.events = G_IO_IN | G_IO_HUP | G_IO_ERR;
+    notify->cb = cb;
+    notify->opaque = opaque;
+    g_source_add_poll(&notify->source, &notify->pfd);
+
+    return notify;
+
+fail:
+    g_source_destroy(&notify->source);
+    return NULL;
+}
+
+int filter_notifier_set(FilterNotifier *notify, uint64_t value)
+{
+    ssize_t ret;
+
+    do {
+        ret = write(notify->event.wfd, &value, sizeof(value));
+    } while (ret < 0 && errno == EINTR);
+
+    /* EAGAIN is fine, a read must be pending. */
+    if (ret < 0 && errno != EAGAIN) {
+        return -errno;
+    }
+    return 0;
+}
diff --git a/net/colo.h b/net/colo.h
index cd9027f..00f03b5 100644
--- a/net/colo.h
+++ b/net/colo.h
@@ -19,6 +19,7 @@
 #include "qemu/jhash.h"
 #include "qemu/timer.h"
 #include "slirp/tcp.h"
+#include "qemu/event_notifier.h"
 
 #define HASHTABLE_MAX_SIZE 16384
 
@@ -89,4 +90,21 @@ void connection_hashtable_reset(GHashTable *connection_track_table);
 Packet *packet_new(const void *data, int size);
 void packet_destroy(void *opaque, void *user_data);
 
+typedef void FilterNotifierCallback(void *opaque, int value);
+typedef struct FilterNotifier {
+    GSource source;
+    EventNotifier event;
+    GPollFD pfd;
+    FilterNotifierCallback *cb;
+    void *opaque;
+} FilterNotifier;
+
+FilterNotifier *filter_noitifier_new(FilterNotifierCallback *cb,
+                                     void *opaque, Error **errp);
+int filter_notifier_set(FilterNotifier *notify, uint64_t value);
+
+enum {
+    COLO_CHECKPOINT = 2,
+    COLO_FAILOVER,
+};
 #endif /* QEMU_COLO_PROXY_H */
-- 
1.8.3.1

From nobody Thu Nov 6 18:58:36 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zoho.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender)
smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1487735644459549.3070508105271; Tue, 21 Feb 2017 19:54:04 -0800 (PST) Received: from localhost ([::1]:49689 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cgO0A-00064t-OY for importer@patchew.org; Tue, 21 Feb 2017 22:54:02 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:42597) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cgNpO-0004iB-Lv for qemu-devel@nongnu.org; Tue, 21 Feb 2017 22:42:56 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1cgNpJ-0002uo-QX for qemu-devel@nongnu.org; Tue, 21 Feb 2017 22:42:54 -0500 Received: from [45.249.212.187] (port=2541 helo=dggrg01-dlp.huawei.com) by eggs.gnu.org with esmtps (TLS1.0:RSA_ARCFOUR_SHA1:16) (Exim 4.71) (envelope-from ) id 1cgNpI-0002pm-SJ for qemu-devel@nongnu.org; Tue, 21 Feb 2017 22:42:49 -0500 Received: from 172.30.72.54 (EHLO DGGEMM401-HUB.china.huawei.com) ([172.30.72.54]) by dggrg01-dlp.huawei.com (MOS 4.4.6-GA FastPath queued) with ESMTP id AJN08768; Wed, 22 Feb 2017 11:42:41 +0800 (CST) Received: from DGGEML401-HUB.china.huawei.com (10.3.17.32) by DGGEMM401-HUB.china.huawei.com (10.3.20.209) with Microsoft SMTP Server (TLS) id 14.3.301.0; Wed, 22 Feb 2017 11:42:35 +0800 Received: from localhost (10.177.24.212) by DGGEML401-HUB.china.huawei.com (10.3.17.32) with Microsoft SMTP Server id 14.3.301.0; Wed, 22 Feb 2017 11:42:25 +0800 From: zhanghailiang To: , , Date: Wed, 22 Feb 2017 11:42:03 +0800 Message-ID: <1487734936-43472-3-git-send-email-zhang.zhanghailiang@huawei.com> X-Mailer: git-send-email 2.7.2.windows.1 In-Reply-To: <1487734936-43472-1-git-send-email-zhang.zhanghailiang@huawei.com> References: <1487734936-43472-1-git-send-email-zhang.zhanghailiang@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.177.24.212] 
X-CFilter-Loop: Reflected X-Mirapoint-Virus-RAPID-Raw: score=unknown(0), refid=str=0001.0A020204.58AD08B2.0203, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0, ip=0.0.0.0, so=2014-11-16 11:51:01, dmn=2013-03-21 17:37:32 X-Mirapoint-Loop-Id: b34febb9b2af289fe61126f04f4773a1 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.4.x-2.6.x [generic] [fuzzy] X-Received-From: 45.249.212.187 Subject: [Qemu-devel] [PATCH 02/15] colo-compare: implement the process of checkpoint X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jason Wang , xiecl.fnst@cn.fujitsu.com, zhanghailiang , lizhijian@cn.fujitsu.com Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"

While doing a checkpoint, we need to flush all the unhandled packets.
By using the filter notifier mechanism, we can easily notify every
compare object to do this, which runs inside the compare threads
as a coroutine.
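
For illustration only (this is not part of the patch): the checkpoint
path posts one event per compare object and then blocks until every
compare thread has handled it, using a counter protected by a mutex and
a condition variable. The sketch below shows that wait pattern with
plain pthreads. All demo_* names and DEMO_WORKERS are hypothetical, and
short-lived threads stand in for the long-running compare threads that
the patch wakes through the notifier.

```c
#include <pthread.h>

#define DEMO_WORKERS 3 /* hypothetical number of compare threads */

static pthread_mutex_t demo_mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t demo_done = PTHREAD_COND_INITIALIZER;
static int demo_unhandled; /* events posted but not yet handled */
static int demo_flushed;   /* how many workers ran their flush step */

/* Worker: stands in for a compare thread handling one checkpoint event. */
static void *demo_worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&demo_mtx);
    demo_flushed++; /* placeholder for flushing the queued packets */
    demo_unhandled--;
    pthread_cond_broadcast(&demo_done);
    pthread_mutex_unlock(&demo_mtx);
    return NULL;
}

/*
 * Notifier side: post one event per worker, then block until all of them
 * have been handled. Returns the number of workers that were notified.
 */
static int demo_notify_all(void)
{
    pthread_t tids[DEMO_WORKERS];
    int i, started = 0;

    pthread_mutex_lock(&demo_mtx);
    for (i = 0; i < DEMO_WORKERS; i++) {
        if (pthread_create(&tids[i], NULL, demo_worker, NULL) != 0) {
            break;
        }
        /* Counted under the lock, so no worker can decrement first. */
        demo_unhandled++;
        started++;
    }
    /* The loop guards against spurious wake-ups. */
    while (demo_unhandled > 0) {
        pthread_cond_wait(&demo_done, &demo_mtx);
    }
    pthread_mutex_unlock(&demo_mtx);

    for (i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
    }
    return started;
}
```

In this sketch a failed pthread_create() stops posting new events but
still waits for the workers already started, so the counter always
returns to zero before the lock is released.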
Cc: Jason Wang
Signed-off-by: zhanghailiang
Signed-off-by: Zhang Chen
---
 net/colo-compare.c | 72 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 net/colo-compare.h | 20 +++++++++++++++
 2 files changed, 92 insertions(+)
 create mode 100644 net/colo-compare.h

diff --git a/net/colo-compare.c b/net/colo-compare.c
index a6fc2ff..61a8ee4 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -29,17 +29,24 @@
 #include "qemu/sockets.h"
 #include "qapi-visit.h"
 #include "net/colo.h"
+#include "net/colo-compare.h"
 
 #define TYPE_COLO_COMPARE "colo-compare"
 #define COLO_COMPARE(obj) \
     OBJECT_CHECK(CompareState, (obj), TYPE_COLO_COMPARE)
 
+static QTAILQ_HEAD(, CompareState) net_compares =
+       QTAILQ_HEAD_INITIALIZER(net_compares);
+
 #define COMPARE_READ_LEN_MAX NET_BUFSIZE
 #define MAX_QUEUE_SIZE 1024
 
 /* TODO: Should be configurable */
 #define REGULAR_PACKET_CHECK_MS 3000
 
+static QemuMutex event_mtx = { .lock = PTHREAD_MUTEX_INITIALIZER };
+static QemuCond event_complete_cond = { .cond = PTHREAD_COND_INITIALIZER };
+static int event_unhandled_count;
 /*
   + CompareState ++
   |               |
@@ -86,6 +93,10 @@ typedef struct CompareState {
 
     GMainContext *worker_context;
     GMainLoop *compare_loop;
+    /* Used for COLO to notify compare to do something */
+    FilterNotifier *notifier;
+
+    QTAILQ_ENTRY(CompareState) next;
 } CompareState;
 
 typedef struct CompareClass {
@@ -375,6 +386,11 @@ static void colo_compare_connection(void *opaque, void *user_data)
     while (!g_queue_is_empty(&conn->primary_list) &&
            !g_queue_is_empty(&conn->secondary_list)) {
         pkt = g_queue_pop_tail(&conn->primary_list);
+        if (!pkt) {
+            error_report("colo-compare pop pkt failed");
+            return;
+        }
+
         switch (conn->ip_proto) {
         case IPPROTO_TCP:
             result = g_queue_find_custom(&conn->secondary_list,
@@ -496,6 +512,52 @@ static gboolean check_old_packet_regular(void *opaque)
     return TRUE;
 }
 
+/* Public API, Used for COLO frame to notify compare event */
+void colo_notify_compares_event(void *opaque, int event, Error **errp)
+{
+    CompareState *s;
+    int ret;
+
+    qemu_mutex_lock(&event_mtx);
+    QTAILQ_FOREACH(s, &net_compares, next) {
+        ret = filter_notifier_set(s->notifier, event);
+        if (ret < 0) {
+            error_setg_errno(errp, -ret, "Failed to write value to eventfd");
+            goto fail;
+        }
+        event_unhandled_count++;
+    }
+    /* Wait all compare thread to finish handling this event */
+    while (event_unhandled_count) {
+        qemu_cond_wait(&event_complete_cond, &event_mtx);
+    }
+
+fail:
+    qemu_mutex_unlock(&event_mtx);
+}
+
+static void colo_flush_packets(void *opaque, void *user_data);
+
+static void colo_compare_handle_event(void *opaque, int event)
+{
+    FilterNotifier *notify = opaque;
+    CompareState *s = notify->opaque;
+
+    switch (event) {
+    case COLO_CHECKPOINT:
+        g_queue_foreach(&s->conn_list, colo_flush_packets, s);
+        break;
+    case COLO_FAILOVER:
+        break;
+    default:
+        break;
+    }
+    qemu_mutex_lock(&event_mtx);
+    event_unhandled_count--;
+    qemu_cond_broadcast(&event_complete_cond);
+    qemu_mutex_unlock(&event_mtx);
+}
+
 static void *colo_compare_thread(void *opaque)
 {
     CompareState *s = opaque;
@@ -516,8 +578,12 @@ static void *colo_compare_thread(void *opaque)
                           (GSourceFunc)check_old_packet_regular, s, NULL);
     g_source_attach(timeout_source, s->worker_context);
 
+    s->notifier = filter_noitifier_new(colo_compare_handle_event, s, NULL);
+    g_source_attach(&s->notifier->source, s->worker_context);
+
     g_main_loop_run(s->compare_loop);
 
+    g_source_unref(&s->notifier->source);
     g_source_unref(timeout_source);
     g_main_loop_unref(s->compare_loop);
     g_main_context_unref(s->worker_context);
@@ -660,6 +726,8 @@ static void colo_compare_complete(UserCreatable *uc, Error **errp)
     net_socket_rs_init(&s->pri_rs, compare_pri_rs_finalize);
     net_socket_rs_init(&s->sec_rs, compare_sec_rs_finalize);
 
+    QTAILQ_INSERT_TAIL(&net_compares, s, next);
+
     g_queue_init(&s->conn_list);
 
     s->connection_track_table = g_hash_table_new_full(connection_key_hash,
@@ -726,6 +794,10 @@ static void colo_compare_finalize(Object *obj)
     g_main_loop_quit(s->compare_loop);
     qemu_thread_join(&s->thread);
 
+    if (!QTAILQ_EMPTY(&net_compares)) {
+        QTAILQ_REMOVE(&net_compares, s, next);
+    }
+
     /* Release all unhandled packets after compare thead exited */
     g_queue_foreach(&s->conn_list, colo_flush_packets, s);
 
diff --git a/net/colo-compare.h b/net/colo-compare.h
new file mode 100644
index 0000000..ed823ed
--- /dev/null
+++ b/net/colo-compare.h
@@ -0,0 +1,20 @@
+/*
+ * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
+ * (a.k.a. Fault Tolerance or Continuous Replication)
+ *
+ * Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
+ * Copyright (c) 2016 FUJITSU LIMITED
+ * Copyright (c) 2016 Intel Corporation
+ *
+ * Author: Zhang Chen
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later. See the COPYING file in the top-level directory.
+ */
+
+#ifndef QEMU_COLO_COMPARE_H
+#define QEMU_COLO_COMPARE_H
+
+void colo_notify_compares_event(void *opaque, int event, Error **errp);
+
+#endif /* QEMU_COLO_COMPARE_H */
-- 
1.8.3.1

From nobody Thu Nov 6 18:58:36 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zoho.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1487735645403487.06130389199564; Tue, 21 Feb 2017 19:54:05 -0800 (PST) Received: from localhost ([::1]:49690 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cgO0B-00065Y-2q for importer@patchew.org; Tue, 21 Feb 2017 22:54:03 -0500 Received: from eggs.gnu.org
([2001:4830:134:3::10]:42573) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cgNpN-0004i3-Rs for qemu-devel@nongnu.org; Tue, 21 Feb 2017 22:42:58 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1cgNpJ-0002tz-Gk for qemu-devel@nongnu.org; Tue, 21 Feb 2017 22:42:53 -0500 Received: from [45.249.212.187] (port=2542 helo=dggrg01-dlp.huawei.com) by eggs.gnu.org with esmtps (TLS1.0:RSA_ARCFOUR_SHA1:16) (Exim 4.71) (envelope-from ) id 1cgNpI-0002pj-RZ for qemu-devel@nongnu.org; Tue, 21 Feb 2017 22:42:49 -0500 Received: from 172.30.72.54 (EHLO DGGEMM401-HUB.china.huawei.com) ([172.30.72.54]) by dggrg01-dlp.huawei.com (MOS 4.4.6-GA FastPath queued) with ESMTP id AJN08772; Wed, 22 Feb 2017 11:42:41 +0800 (CST) Received: from DGGEML401-HUB.china.huawei.com (10.3.17.32) by DGGEMM401-HUB.china.huawei.com (10.3.20.209) with Microsoft SMTP Server (TLS) id 14.3.301.0; Wed, 22 Feb 2017 11:42:35 +0800 Received: from localhost (10.177.24.212) by DGGEML401-HUB.china.huawei.com (10.3.17.32) with Microsoft SMTP Server id 14.3.301.0; Wed, 22 Feb 2017 11:42:26 +0800 From: zhanghailiang To: , , Date: Wed, 22 Feb 2017 11:42:04 +0800 Message-ID: <1487734936-43472-4-git-send-email-zhang.zhanghailiang@huawei.com> X-Mailer: git-send-email 2.7.2.windows.1 In-Reply-To: <1487734936-43472-1-git-send-email-zhang.zhanghailiang@huawei.com> References: <1487734936-43472-1-git-send-email-zhang.zhanghailiang@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.177.24.212] X-CFilter-Loop: Reflected X-Mirapoint-Virus-RAPID-Raw: score=unknown(0), refid=str=0001.0A020206.58AD08B2.011F, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0, ip=0.0.0.0, so=2014-11-16 11:51:01, dmn=2013-03-21 17:37:32 X-Mirapoint-Loop-Id: 21b8539cf95e04734349cdb8867506ca X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.4.x-2.6.x [generic] [fuzzy] X-Received-From: 45.249.212.187 Subject: [Qemu-devel] [PATCH 03/15] colo-compare: use notifier to notify 
packets comparing result X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jason Wang , xiecl.fnst@cn.fujitsu.com, zhanghailiang , lizhijian@cn.fujitsu.com Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"

Use a notifier to tell the COLO frame when packet comparison finds
an inconsistency.

Cc: Jason Wang
Signed-off-by: Zhang Chen
Signed-off-by: zhanghailiang
---
 net/colo-compare.c | 32 ++++++++++++++++++++++++++++----
 net/colo-compare.h |  2 ++
 2 files changed, 30 insertions(+), 4 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 61a8ee4..0886d7e 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -30,6 +30,7 @@
 #include "qapi-visit.h"
 #include "net/colo.h"
 #include "net/colo-compare.h"
+#include "migration/migration.h"
 
 #define TYPE_COLO_COMPARE "colo-compare"
 #define COLO_COMPARE(obj) \
@@ -38,6 +39,9 @@
 static QTAILQ_HEAD(, CompareState) net_compares =
        QTAILQ_HEAD_INITIALIZER(net_compares);
 
+static NotifierList colo_compare_notifiers =
+    NOTIFIER_LIST_INITIALIZER(colo_compare_notifiers);
+
 #define COMPARE_READ_LEN_MAX NET_BUFSIZE
 #define MAX_QUEUE_SIZE 1024
 
@@ -342,6 +346,22 @@ static int colo_old_packet_check_one(Packet *pkt, int64_t *check_time)
     }
 }
 
+static void colo_compare_inconsistent_notify(void)
+{
+    notifier_list_notify(&colo_compare_notifiers,
+                         migrate_get_current());
+}
+
+void colo_compare_register_notifier(Notifier *notify)
+{
+    notifier_list_add(&colo_compare_notifiers, notify);
+}
+
+void colo_compare_unregister_notifier(Notifier *notify)
+{
+    notifier_remove(notify);
+}
+
 static void colo_old_packet_check_one_conn(void *opaque,
                                            void *user_data)
 {
@@ -355,7 +375,7 @@ static void colo_old_packet_check_one_conn(void *opaque,
 
     if (result) {
         /* do checkpoint will flush old packet */
-        /* TODO: colo_notify_checkpoint();*/
+        colo_compare_inconsistent_notify();
     }
 }
 
@@ -373,7 +393,10 @@ static void colo_old_packet_check(void *opaque)
 
 /*
  * Called from the compare thread on the primary
- * for compare connection
+ * for compare connection.
+ * TODO: Reconstruct this function, we should hold the max handled sequence
+ * of the connect, Don't trigger a checkpoint request if we only get packets
+ * from one side (primary or secondary).
  */
 static void colo_compare_connection(void *opaque, void *user_data)
 {
@@ -422,11 +445,12 @@ static void colo_compare_connection(void *opaque, void *user_data)
             /*
              * If one packet arrive late, the secondary_list or
              * primary_list will be empty, so we can't compare it
-             * until next comparison.
+             * until next comparison. If the packets in the list are
+             * timeout, it will trigger a checkpoint request.
              */
             trace_colo_compare_main("packet different");
             g_queue_push_tail(&conn->primary_list, pkt);
-            /* TODO: colo_notify_checkpoint();*/
+            colo_compare_inconsistent_notify();
             break;
         }
     }
diff --git a/net/colo-compare.h b/net/colo-compare.h
index ed823ed..dc797ec 100644
--- a/net/colo-compare.h
+++ b/net/colo-compare.h
@@ -16,5 +16,7 @@
 #define QEMU_COLO_COMPARE_H
 
 void colo_notify_compares_event(void *opaque, int event, Error **errp);
+void colo_compare_register_notifier(Notifier *notify);
+void colo_compare_unregister_notifier(Notifier *notify);
 
 #endif /* QEMU_COLO_COMPARE_H */
-- 
1.8.3.1

From nobody Thu Nov 6 18:58:36 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zoho.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org;
Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 148773510321999.7680998043777; Tue, 21 Feb 2017 19:45:03 -0800 (PST) Received: from localhost ([::1]:49642 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cgNrP-0005rL-Jl for importer@patchew.org; Tue, 21 Feb 2017 22:44:59 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:42591) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cgNpO-0004iA-Gt for qemu-devel@nongnu.org; Tue, 21 Feb 2017 22:42:55 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1cgNpJ-0002tj-Fa for qemu-devel@nongnu.org; Tue, 21 Feb 2017 22:42:54 -0500 Received: from [45.249.212.187] (port=2539 helo=dggrg01-dlp.huawei.com) by eggs.gnu.org with esmtps (TLS1.0:RSA_ARCFOUR_SHA1:16) (Exim 4.71) (envelope-from ) id 1cgNpI-0002pl-RL for qemu-devel@nongnu.org; Tue, 21 Feb 2017 22:42:49 -0500 Received: from 172.30.72.54 (EHLO DGGEMM401-HUB.china.huawei.com) ([172.30.72.54]) by dggrg01-dlp.huawei.com (MOS 4.4.6-GA FastPath queued) with ESMTP id AJN08765; Wed, 22 Feb 2017 11:42:40 +0800 (CST) Received: from DGGEML401-HUB.china.huawei.com (10.3.17.32) by DGGEMM401-HUB.china.huawei.com (10.3.20.209) with Microsoft SMTP Server (TLS) id 14.3.301.0; Wed, 22 Feb 2017 11:42:35 +0800 Received: from localhost (10.177.24.212) by DGGEML401-HUB.china.huawei.com (10.3.17.32) with Microsoft SMTP Server id 14.3.301.0; Wed, 22 Feb 2017 11:42:26 +0800 From: zhanghailiang To: , , Date: Wed, 22 Feb 2017 11:42:05 +0800 Message-ID: <1487734936-43472-5-git-send-email-zhang.zhanghailiang@huawei.com> X-Mailer: git-send-email 2.7.2.windows.1 In-Reply-To: <1487734936-43472-1-git-send-email-zhang.zhanghailiang@huawei.com> References: <1487734936-43472-1-git-send-email-zhang.zhanghailiang@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.177.24.212] X-CFilter-Loop: Reflected X-Mirapoint-Virus-RAPID-Raw: score=unknown(0), 
refid=str=0001.0A020201.58AD08B2.00EC, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0, ip=0.0.0.0, so=2014-11-16 11:51:01, dmn=2013-03-21 17:37:32 X-Mirapoint-Loop-Id: f8cc05bc955f91a82c5cb635fb5c8bcf X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.4.x-2.6.x [generic] [fuzzy] X-Received-From: 45.249.212.187 Subject: [Qemu-devel] [PATCH 04/15] COLO: integrate colo compare with colo frame X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jason Wang , xiecl.fnst@cn.fujitsu.com, zhanghailiang , lizhijian@cn.fujitsu.com Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"

For COLO FT, the PVM and SVM run at the same time and sync state only
when needed. So let the SVM run while no checkpoint is in progress.
Also change DEFAULT_MIGRATE_X_CHECKPOINT_DELAY to 200 * 100.

Cc: Jason Wang
Signed-off-by: zhanghailiang
Reviewed-by: Dr. David Alan Gilbert
---
 migration/colo.c      | 25 +++++++++++++++++++++++++
 migration/migration.c |  2 +-
 2 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/migration/colo.c b/migration/colo.c
index 712308e..fb8d8fd 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -19,8 +19,11 @@
 #include "qemu/error-report.h"
 #include "qapi/error.h"
 #include "migration/failover.h"
+#include "net/colo-compare.h"
+#include "net/colo.h"
 
 static bool vmstate_loading;
+static Notifier packets_compare_notifier;
 
 #define COLO_BUFFER_BASE_SIZE (4 * 1024 * 1024)
 
@@ -263,6 +266,7 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
     if (local_err) {
         goto out;
     }
+
     /* Reset channel-buffer directly */
     qio_channel_io_seek(QIO_CHANNEL(bioc), 0, 0, NULL);
     bioc->usage = 0;
@@ -283,6 +287,11 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
         goto out;
     }
 
+    colo_notify_compares_event(NULL, COLO_CHECKPOINT, &local_err);
+    if (local_err) {
+        goto out;
+    }
+
     /* Disable block migration */
     s->params.blk = 0;
     s->params.shared = 0;
@@ -341,6 +350,11 @@ out:
     return ret;
 }
 
+static void colo_compare_notify_checkpoint(Notifier *notifier, void *data)
+{
+    colo_checkpoint_notify(data);
+}
+
 static void colo_process_checkpoint(MigrationState *s)
 {
     QIOChannelBuffer *bioc;
@@ -357,6 +371,9 @@ static void colo_process_checkpoint(MigrationState *s)
         goto out;
     }
 
+    packets_compare_notifier.notify = colo_compare_notify_checkpoint;
+    colo_compare_register_notifier(&packets_compare_notifier);
+
     /*
      * Wait for Secondary finish loading VM states and enter COLO
      * restore.
@@ -402,6 +419,7 @@ out:
         qemu_fclose(fb);
     }
 
+    colo_compare_unregister_notifier(&packets_compare_notifier);
     timer_del(s->colo_delay_timer);
 
     /* Hope this not to be too long to wait here */
@@ -518,6 +536,11 @@ void *colo_process_incoming_thread(void *opaque)
         goto out;
     }
 
+    qemu_mutex_lock_iothread();
+    vm_stop_force_state(RUN_STATE_COLO);
+    trace_colo_vm_state_change("run", "stop");
+    qemu_mutex_unlock_iothread();
+
     /* FIXME: This is unnecessary for periodic checkpoint mode */
     colo_send_message(mis->to_src_file, COLO_MESSAGE_CHECKPOINT_REPLY,
                       &local_err);
@@ -571,6 +594,8 @@ void *colo_process_incoming_thread(void *opaque)
         }
 
         vmstate_loading = false;
+        vm_start();
+        trace_colo_vm_state_change("stop", "run");
         qemu_mutex_unlock_iothread();
 
         if (failover_get_state() == FAILOVER_STATUS_RELAUNCH) {
diff --git a/migration/migration.c b/migration/migration.c
index c6ae69d..2339be7 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -66,7 +66,7 @@
 /* The delay time (in ms) between two COLO checkpoints
  * Note: Please change this default value to 10000 when we support hybrid mode.
  */
-#define DEFAULT_MIGRATE_X_CHECKPOINT_DELAY 200
+#define DEFAULT_MIGRATE_X_CHECKPOINT_DELAY (200 * 100)
 
 static NotifierList migration_state_notifiers =
     NOTIFIER_LIST_INITIALIZER(migration_state_notifiers);
-- 
1.8.3.1

From nobody Thu Nov 6 18:58:36 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zoho.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1487735379909476.5515485703112; Tue, 21 Feb 2017 19:49:39 -0800 (PST) Received: from localhost ([::1]:49666 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cgNvu-0001rl-LR for importer@patchew.org; Tue, 21 Feb 2017 22:49:38 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:42588) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cgNpO-0004i9-Aq for qemu-devel@nongnu.org; Tue, 21 Feb 2017 22:42:55 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1cgNpJ-0002uW-OQ for qemu-devel@nongnu.org; Tue, 21 Feb 2017 22:42:54 -0500 Received: from [45.249.212.187] (port=2540 helo=dggrg01-dlp.huawei.com) by eggs.gnu.org with esmtps (TLS1.0:RSA_ARCFOUR_SHA1:16) (Exim 4.71) (envelope-from ) id 1cgNpI-0002pn-Qs for qemu-devel@nongnu.org; Tue, 21 Feb 2017 22:42:49 -0500 Received: from 172.30.72.54 (EHLO DGGEMM401-HUB.china.huawei.com) ([172.30.72.54]) by dggrg01-dlp.huawei.com (MOS 4.4.6-GA FastPath queued) with ESMTP id AJN08766; Wed, 22 Feb 2017 11:42:41 +0800 (CST) Received: from DGGEML401-HUB.china.huawei.com (10.3.17.32) by DGGEMM401-HUB.china.huawei.com (10.3.20.209) with Microsoft SMTP
Server (TLS) id 14.3.301.0; Wed, 22 Feb 2017 11:42:35 +0800 Received: from localhost (10.177.24.212) by DGGEML401-HUB.china.huawei.com (10.3.17.32) with Microsoft SMTP Server id 14.3.301.0; Wed, 22 Feb 2017 11:42:27 +0800 From: zhanghailiang To: , , Date: Wed, 22 Feb 2017 11:42:06 +0800 Message-ID: <1487734936-43472-6-git-send-email-zhang.zhanghailiang@huawei.com> X-Mailer: git-send-email 2.7.2.windows.1 In-Reply-To: <1487734936-43472-1-git-send-email-zhang.zhanghailiang@huawei.com> References: <1487734936-43472-1-git-send-email-zhang.zhanghailiang@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.177.24.212] X-CFilter-Loop: Reflected X-Mirapoint-Virus-RAPID-Raw: score=unknown(0), refid=str=0001.0A020202.58AD08B2.01F4, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0, ip=0.0.0.0, so=2014-11-16 11:51:01, dmn=2013-03-21 17:37:32 X-Mirapoint-Loop-Id: 2de2bde911e5e6059d169d2140c44f99 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.4.x-2.6.x [generic] [fuzzy] X-Received-From: 45.249.212.187 Subject: [Qemu-devel] [PATCH 05/15] COLO: Handle shutdown command for VM in COLO state X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Paolo Bonzini , xiecl.fnst@cn.fujitsu.com, zhanghailiang , lizhijian@cn.fujitsu.com Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"

If the VM is in COLO FT state, we need to do some extra work before
starting the normal shutdown process. The Secondary VM ignores a shutdown
command that users issue to it directly. COLO captures the shutdown
command when the user requests a shutdown.

Cc: Paolo Bonzini
Signed-off-by: zhanghailiang
Signed-off-by: Li Zhijian
Reviewed-by: Dr. David Alan Gilbert
---
v19:
- Fix title and comment
v15:
- Go on with the shutdown process even if an error happened while
  sending the 'SHUTDOWN' message to SVM.
- Add Reviewed-by tag
v14:
- Remove 'colo_shutdown' variable, use colo_shutdown_request directly
v13:
- Move COLO shutdown related code to colo.c (Dave's suggestion)
---
 include/migration/colo.h |  1 +
 include/sysemu/sysemu.h  |  3 +++
 migration/colo.c         | 46 +++++++++++++++++++++++++++++++++++++++++++++-
 qapi-schema.json         |  4 +++-
 vl.c                     | 19 ++++++++++++++++---
 5 files changed, 68 insertions(+), 5 deletions(-)

diff --git a/include/migration/colo.h b/include/migration/colo.h
index 2bbff9e..aadd040 100644
--- a/include/migration/colo.h
+++ b/include/migration/colo.h
@@ -37,4 +37,5 @@ COLOMode get_colo_mode(void);
 void colo_do_failover(MigrationState *s);
 
 void colo_checkpoint_notify(void *opaque);
+bool colo_handle_shutdown(void);
 #endif
diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 576c7ce..7ed665a 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -49,6 +49,8 @@ typedef enum WakeupReason {
     QEMU_WAKEUP_REASON_OTHER,
 } WakeupReason;
 
+extern int colo_shutdown_requested;
+
 void qemu_system_reset_request(void);
 void qemu_system_suspend_request(void);
 void qemu_register_suspend_notifier(Notifier *notifier);
@@ -56,6 +58,7 @@ void qemu_system_wakeup_request(WakeupReason reason);
 void qemu_system_wakeup_enable(WakeupReason reason, bool enabled);
 void qemu_register_wakeup_notifier(Notifier *notifier);
 void qemu_system_shutdown_request(void);
+void qemu_system_shutdown_request_core(void);
 void qemu_system_powerdown_request(void);
 void qemu_register_powerdown_notifier(Notifier *notifier);
 void qemu_system_debug_request(void);
diff --git a/migration/colo.c b/migration/colo.c
index fb8d8fd..4626435 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -336,6 +336,21 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
         goto out;
     }
 
+    if (colo_shutdown_requested) {
+ colo_send_message(s->to_dst_file, COLO_MESSAGE_GUEST_SHUTDOWN, + &local_err); + if (local_err) { + error_free(local_err); + /* Go on the shutdown process and throw the error message */ + error_report("Failed to send shutdown message to SVM"); + } + qemu_fflush(s->to_dst_file); + colo_shutdown_requested =3D 0; + qemu_system_shutdown_request_core(); + /* Fix me: Just let the colo thread exit ? */ + qemu_thread_exit(0); + } + ret =3D 0; =20 qemu_mutex_lock_iothread(); @@ -401,7 +416,9 @@ static void colo_process_checkpoint(MigrationState *s) goto out; } =20 - qemu_sem_wait(&s->colo_checkpoint_sem); + if (!colo_shutdown_requested) { + qemu_sem_wait(&s->colo_checkpoint_sem); + } =20 ret =3D colo_do_checkpoint_transaction(s, bioc, fb); if (ret < 0) { @@ -477,6 +494,16 @@ static void colo_wait_handle_message(QEMUFile *f, int = *checkpoint_request, case COLO_MESSAGE_CHECKPOINT_REQUEST: *checkpoint_request =3D 1; break; + case COLO_MESSAGE_GUEST_SHUTDOWN: + qemu_mutex_lock_iothread(); + vm_stop_force_state(RUN_STATE_COLO); + qemu_system_shutdown_request_core(); + qemu_mutex_unlock_iothread(); + /* + * The main thread will be exit and terminate the whole + * process, do need some cleanup ? + */ + qemu_thread_exit(0); default: *checkpoint_request =3D 0; error_setg(errp, "Got unknown COLO message: %d", msg); @@ -634,3 +661,20 @@ out: =20 return NULL; } + +bool colo_handle_shutdown(void) +{ + /* + * If VM is in COLO-FT mode, we need do some significant work before + * respond to the shutdown request. Besides, Secondary VM will ignore + * the shutdown request from users. + */ + if (migration_incoming_in_colo_state()) { + return true; + } + if (migration_in_colo_state()) { + colo_shutdown_requested =3D 1; + return true; + } + return false; +} diff --git a/qapi-schema.json b/qapi-schema.json index e9a6364..0521054 100644 --- a/qapi-schema.json +++ b/qapi-schema.json @@ -1157,12 +1157,14 @@ # # @vmstate-loaded: VM's state has been loaded by SVM. 
# +# @guest-shutdown: shutdown require from PVM to SVM +# # Since: 2.8 ## { 'enum': 'COLOMessage', 'data': [ 'checkpoint-ready', 'checkpoint-request', 'checkpoint-reply', 'vmstate-send', 'vmstate-size', 'vmstate-received', - 'vmstate-loaded' ] } + 'vmstate-loaded', 'guest-shutdown' ] } =20 ## # @COLOMode: diff --git a/vl.c b/vl.c index b5d0a19..daad8df 100644 --- a/vl.c +++ b/vl.c @@ -1587,6 +1587,8 @@ static NotifierList wakeup_notifiers =3D NOTIFIER_LIST_INITIALIZER(wakeup_notifiers); static uint32_t wakeup_reason_mask =3D ~(1 << QEMU_WAKEUP_REASON_NONE); =20 +int colo_shutdown_requested; + int qemu_shutdown_requested_get(void) { return shutdown_requested; @@ -1713,7 +1715,10 @@ void qemu_system_guest_panicked(GuestPanicInformatio= n *info) void qemu_system_reset_request(void) { if (no_reboot) { - shutdown_requested =3D 1; + qemu_system_shutdown_request(); + if (!shutdown_requested) {/* colo handle it ? */ + return; + } } else { reset_requested =3D 1; } @@ -1786,14 +1791,22 @@ void qemu_system_killed(int signal, pid_t pid) qemu_notify_event(); } =20 -void qemu_system_shutdown_request(void) +void qemu_system_shutdown_request_core(void) { - trace_qemu_system_shutdown_request(); replay_shutdown_request(); shutdown_requested =3D 1; qemu_notify_event(); } =20 +void qemu_system_shutdown_request(void) +{ + trace_qemu_system_shutdown_request(); + if (colo_handle_shutdown()) { + return; + } + qemu_system_shutdown_request_core(); +} + static void qemu_system_powerdown(void) { qapi_event_send_powerdown(&error_abort); --=20 1.8.3.1 From nobody Thu Nov 6 18:58:36 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zoho.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) 
smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1487735380343702.3809071076314; Tue, 21 Feb 2017 19:49:40 -0800 (PST) Received: from localhost ([::1]:49667 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cgNvv-0001sH-2C for importer@patchew.org; Tue, 21 Feb 2017 22:49:39 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:42586) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cgNpO-0004i8-7v for qemu-devel@nongnu.org; Tue, 21 Feb 2017 22:42:55 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1cgNpJ-0002ue-Pa for qemu-devel@nongnu.org; Tue, 21 Feb 2017 22:42:54 -0500 Received: from [45.249.212.187] (port=2543 helo=dggrg01-dlp.huawei.com) by eggs.gnu.org with esmtps (TLS1.0:RSA_ARCFOUR_SHA1:16) (Exim 4.71) (envelope-from ) id 1cgNpI-0002qZ-Qe for qemu-devel@nongnu.org; Tue, 21 Feb 2017 22:42:49 -0500 Received: from 172.30.72.54 (EHLO DGGEMM401-HUB.china.huawei.com) ([172.30.72.54]) by dggrg01-dlp.huawei.com (MOS 4.4.6-GA FastPath queued) with ESMTP id AJN08771; Wed, 22 Feb 2017 11:42:41 +0800 (CST) Received: from DGGEML401-HUB.china.huawei.com (10.3.17.32) by DGGEMM401-HUB.china.huawei.com (10.3.20.209) with Microsoft SMTP Server (TLS) id 14.3.301.0; Wed, 22 Feb 2017 11:42:35 +0800 Received: from localhost (10.177.24.212) by DGGEML401-HUB.china.huawei.com (10.3.17.32) with Microsoft SMTP Server id 14.3.301.0; Wed, 22 Feb 2017 11:42:28 +0800 From: zhanghailiang To: , , Date: Wed, 22 Feb 2017 11:42:07 +0800 Message-ID: <1487734936-43472-7-git-send-email-zhang.zhanghailiang@huawei.com> X-Mailer: git-send-email 2.7.2.windows.1 In-Reply-To: <1487734936-43472-1-git-send-email-zhang.zhanghailiang@huawei.com> References: <1487734936-43472-1-git-send-email-zhang.zhanghailiang@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.177.24.212] 
X-CFilter-Loop: Reflected X-Mirapoint-Virus-RAPID-Raw: score=unknown(0), refid=str=0001.0A020205.58AD08B2.00EF, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0, ip=0.0.0.0, so=2014-11-16 11:51:01, dmn=2013-03-21 17:37:32 X-Mirapoint-Loop-Id: 98b9ca3df27fd973b28f300a5fd64ed2 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.4.x-2.6.x [generic] [fuzzy] X-Received-From: 45.249.212.187 Subject: [Qemu-devel] [PATCH 06/15] COLO: Add block replication into colo process X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , lizhijian@cn.fujitsu.com, xiecl.fnst@cn.fujitsu.com, zhanghailiang , Wen Congyang , Max Reitz , Stefan Hajnoczi Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Make sure the master starts block replication only after the slave's block replication has started. Signed-off-by: zhanghailiang Signed-off-by: Wen Congyang Signed-off-by: Li Zhijian Cc: Stefan Hajnoczi Cc: Kevin Wolf Cc: Max Reitz Cc: Changlong Xie --- migration/colo.c | 56 ++++++++++++++++++++++++++++++++++++++++++++++++++++= ++++ 1 file changed, 56 insertions(+) diff --git a/migration/colo.c b/migration/colo.c index 4626435..1e3e975 100644 --- a/migration/colo.c +++ b/migration/colo.c @@ -21,6 +21,9 @@ #include "migration/failover.h" #include "net/colo-compare.h" #include "net/colo.h" +#include "qapi-event.h" +#include "block/block.h" +#include "replication.h" =20 static bool vmstate_loading; static Notifier packets_compare_notifier; @@ -55,6 +58,7 @@ static void secondary_vm_do_failover(void) { int old_state; MigrationIncomingState *mis =3D migration_incoming_get_current(); + Error *local_err =3D NULL; =20 /* Can not do failover during the process of VM's loading VMstate, Or * it will break the secondary VM. 
@@ -72,6 +76,11 @@ static void secondary_vm_do_failover(void) migrate_set_state(&mis->state, MIGRATION_STATUS_COLO, MIGRATION_STATUS_COMPLETED); =20 + replication_stop_all(true, &local_err); + if (local_err) { + error_report_err(local_err); + } + if (!autostart) { error_report("\"-S\" qemu option will be ignored in secondary side= "); /* recover runstate to normal migration finish state */ @@ -109,6 +118,7 @@ static void primary_vm_do_failover(void) { MigrationState *s =3D migrate_get_current(); int old_state; + Error *local_err =3D NULL; =20 migrate_set_state(&s->state, MIGRATION_STATUS_COLO, MIGRATION_STATUS_COMPLETED); @@ -132,6 +142,12 @@ static void primary_vm_do_failover(void) FailoverStatus_lookup[old_state]); return; } + + replication_stop_all(true, &local_err); + if (local_err) { + error_report_err(local_err); + } + /* Notify COLO thread that failover work is finished */ qemu_sem_post(&s->colo_exit_sem); } @@ -297,6 +313,15 @@ static int colo_do_checkpoint_transaction(MigrationSta= te *s, s->params.shared =3D 0; qemu_savevm_state_header(fb); qemu_savevm_state_begin(fb, &s->params); + + /* We call this API although this may do nothing on primary side. 
*/ + qemu_mutex_lock_iothread(); + replication_do_checkpoint_all(&local_err); + qemu_mutex_unlock_iothread(); + if (local_err) { + goto out; + } + qemu_mutex_lock_iothread(); qemu_savevm_state_complete_precopy(fb, false); qemu_mutex_unlock_iothread(); @@ -403,6 +428,12 @@ static void colo_process_checkpoint(MigrationState *s) object_unref(OBJECT(bioc)); =20 qemu_mutex_lock_iothread(); + replication_start_all(REPLICATION_MODE_PRIMARY, &local_err); + if (local_err) { + qemu_mutex_unlock_iothread(); + goto out; + } + vm_start(); qemu_mutex_unlock_iothread(); trace_colo_vm_state_change("stop", "run"); @@ -497,6 +528,7 @@ static void colo_wait_handle_message(QEMUFile *f, int *= checkpoint_request, case COLO_MESSAGE_GUEST_SHUTDOWN: qemu_mutex_lock_iothread(); vm_stop_force_state(RUN_STATE_COLO); + replication_stop_all(false, NULL); qemu_system_shutdown_request_core(); qemu_mutex_unlock_iothread(); /* @@ -544,6 +576,18 @@ void *colo_process_incoming_thread(void *opaque) fb =3D qemu_fopen_channel_input(QIO_CHANNEL(bioc)); object_unref(OBJECT(bioc)); =20 + qemu_mutex_lock_iothread(); + bdrv_invalidate_cache_all(&local_err); + if (local_err) { + qemu_mutex_unlock_iothread(); + goto out; + } + replication_start_all(REPLICATION_MODE_SECONDARY, &local_err); + qemu_mutex_unlock_iothread(); + if (local_err) { + goto out; + } + colo_send_message(mis->to_src_file, COLO_MESSAGE_CHECKPOINT_READY, &local_err); if (local_err) { @@ -620,6 +664,18 @@ void *colo_process_incoming_thread(void *opaque) goto out; } =20 + replication_get_error_all(&local_err); + if (local_err) { + qemu_mutex_unlock_iothread(); + goto out; + } + /* discard colo disk buffer */ + replication_do_checkpoint_all(&local_err); + if (local_err) { + qemu_mutex_unlock_iothread(); + goto out; + } + vmstate_loading =3D false; vm_start(); trace_colo_vm_state_change("stop", "run"); --=20 1.8.3.1 From nobody Thu Nov 6 18:58:36 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 
208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zoho.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 148773538427394.85454860361665; Tue, 21 Feb 2017 19:49:44 -0800 (PST) Received: from localhost ([::1]:49668 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cgNvx-0001u1-Mf for importer@patchew.org; Tue, 21 Feb 2017 22:49:41 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:42580) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cgNpO-0004i5-3l for qemu-devel@nongnu.org; Tue, 21 Feb 2017 22:42:56 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1cgNpJ-0002uP-NZ for qemu-devel@nongnu.org; Tue, 21 Feb 2017 22:42:54 -0500 Received: from [45.249.212.189] (port=2399 helo=dggrg03-dlp.huawei.com) by eggs.gnu.org with esmtps (TLS1.0:RSA_ARCFOUR_SHA1:16) (Exim 4.71) (envelope-from ) id 1cgNpI-0002pf-RQ for qemu-devel@nongnu.org; Tue, 21 Feb 2017 22:42:49 -0500 Received: from 172.30.72.53 (EHLO DGGEMM403-HUB.china.huawei.com) ([172.30.72.53]) by dggrg03-dlp.huawei.com (MOS 4.4.6-GA FastPath queued) with ESMTP id AIX52740; Wed, 22 Feb 2017 11:42:41 +0800 (CST) Received: from DGGEML401-HUB.china.huawei.com (10.3.17.32) by DGGEMM403-HUB.china.huawei.com (10.3.20.211) with Microsoft SMTP Server (TLS) id 14.3.301.0; Wed, 22 Feb 2017 11:42:40 +0800 Received: from localhost (10.177.24.212) by DGGEML401-HUB.china.huawei.com (10.3.17.32) with Microsoft SMTP Server id 14.3.301.0; Wed, 22 Feb 2017 11:42:29 +0800 From: zhanghailiang To: , , Date: Wed, 22 Feb 2017 11:42:08 +0800 Message-ID: 
<1487734936-43472-8-git-send-email-zhang.zhanghailiang@huawei.com> X-Mailer: git-send-email 2.7.2.windows.1 In-Reply-To: <1487734936-43472-1-git-send-email-zhang.zhanghailiang@huawei.com> References: <1487734936-43472-1-git-send-email-zhang.zhanghailiang@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.177.24.212] X-CFilter-Loop: Reflected X-Mirapoint-Virus-RAPID-Raw: score=unknown(0), refid=str=0001.0A020206.58AD08B2.0107, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0, ip=0.0.0.0, so=2014-11-16 11:51:01, dmn=2013-03-21 17:37:32 X-Mirapoint-Loop-Id: 34825169dfde4a26e0cf49c81ad55531 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.4.x-2.6.x [generic] [fuzzy] X-Received-From: 45.249.212.189 Subject: [Qemu-devel] [PATCH 07/15] COLO: Load PVM's dirty pages into SVM's RAM cache temporarily X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: xiecl.fnst@cn.fujitsu.com, zhanghailiang , lizhijian@cn.fujitsu.com, Juan Quintela Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" We should not load the PVM's state directly into the SVM, because errors may occur while the SVM is receiving data, and a partial load would break the SVM. We need to ensure that all data has been received before loading the state into the SVM. We use extra memory to cache this data (the PVM's RAM). The RAM cache on the secondary side is initially the same as the SVM/PVM's memory. During a checkpoint, we first cache the PVM's dirty pages in this RAM cache, so the RAM cache is always the same as the PVM's memory at every checkpoint; we then flush the cached RAM to the SVM after we have received all of the PVM's state. Cc: Juan Quintela Signed-off-by: zhanghailiang Signed-off-by: Li Zhijian Reviewed-by: Dr. 
David Alan Gilbert --- include/exec/ram_addr.h | 1 + include/migration/migration.h | 4 +++ migration/colo.c | 14 +++++++++ migration/ram.c | 73 +++++++++++++++++++++++++++++++++++++++= +++- 4 files changed, 91 insertions(+), 1 deletion(-) diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h index 3e79466..44e1190 100644 --- a/include/exec/ram_addr.h +++ b/include/exec/ram_addr.h @@ -27,6 +27,7 @@ struct RAMBlock { struct rcu_head rcu; struct MemoryRegion *mr; uint8_t *host; + uint8_t *colo_cache; /* For colo, VM's ram cache */ ram_addr_t offset; ram_addr_t used_length; ram_addr_t max_length; diff --git a/include/migration/migration.h b/include/migration/migration.h index 1735d66..93c6148 100644 --- a/include/migration/migration.h +++ b/include/migration/migration.h @@ -379,4 +379,8 @@ int ram_save_queue_pages(MigrationState *ms, const char= *rbname, PostcopyState postcopy_state_get(void); /* Set the state and return the old state */ PostcopyState postcopy_state_set(PostcopyState new_state); + +/* ram cache */ +int colo_init_ram_cache(void); +void colo_release_ram_cache(void); #endif diff --git a/migration/colo.c b/migration/colo.c index 1e3e975..edb7f00 100644 --- a/migration/colo.c +++ b/migration/colo.c @@ -551,6 +551,7 @@ void *colo_process_incoming_thread(void *opaque) uint64_t total_size; uint64_t value; Error *local_err =3D NULL; + int ret; =20 qemu_sem_init(&mis->colo_incoming_sem, 0); =20 @@ -572,6 +573,12 @@ void *colo_process_incoming_thread(void *opaque) */ qemu_file_set_blocking(mis->from_src_file, true); =20 + ret =3D colo_init_ram_cache(); + if (ret < 0) { + error_report("Failed to initialize ram cache"); + goto out; + } + bioc =3D qio_channel_buffer_new(COLO_BUFFER_BASE_SIZE); fb =3D qemu_fopen_channel_input(QIO_CHANNEL(bioc)); object_unref(OBJECT(bioc)); @@ -705,11 +712,18 @@ out: if (fb) { qemu_fclose(fb); } + /* + * We can ensure BH is hold the global lock, and will join COLO + * incoming thread, so here it is not necessary to lock here 
again, + * Or there will be a deadlock error. + */ + colo_release_ram_cache(); =20 /* Hope this not to be too long to loop here */ qemu_sem_wait(&mis->colo_incoming_sem); qemu_sem_destroy(&mis->colo_incoming_sem); /* Must be called after failover BH is completed */ + if (mis->to_src_file) { qemu_fclose(mis->to_src_file); } diff --git a/migration/ram.c b/migration/ram.c index f289fcd..b588990 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -219,6 +219,7 @@ static RAMBlock *last_sent_block; static ram_addr_t last_offset; static QemuMutex migration_bitmap_mutex; static uint64_t migration_dirty_pages; +static bool ram_cache_enable; static uint32_t last_version; static bool ram_bulk_stage; =20 @@ -2227,6 +2228,20 @@ static inline void *host_from_ram_block_offset(RAMBl= ock *block, return block->host + offset; } =20 +static inline void *colo_cache_from_block_offset(RAMBlock *block, + ram_addr_t offset) +{ + if (!offset_in_ramblock(block, offset)) { + return NULL; + } + if (!block->colo_cache) { + error_report("%s: colo_cache is NULL in block :%s", + __func__, block->idstr); + return NULL; + } + return block->colo_cache + offset; +} + /* * If a page (or a whole RDMA chunk) has been * determined to be zero, then zap it. 
@@ -2542,7 +2557,12 @@ static int ram_load(QEMUFile *f, void *opaque, int v= ersion_id) RAM_SAVE_FLAG_COMPRESS_PAGE | RAM_SAVE_FLAG_XBZRLE)) { RAMBlock *block =3D ram_block_from_stream(f, flags); =20 - host =3D host_from_ram_block_offset(block, addr); + /* After going into COLO, we should load the Page into colo_ca= che */ + if (ram_cache_enable) { + host =3D colo_cache_from_block_offset(block, addr); + } else { + host =3D host_from_ram_block_offset(block, addr); + } if (!host) { error_report("Illegal RAM offset " RAM_ADDR_FMT, addr); ret =3D -EINVAL; @@ -2637,6 +2657,57 @@ static int ram_load(QEMUFile *f, void *opaque, int v= ersion_id) return ret; } =20 +/* + * colo cache: this is for secondary VM, we cache the whole + * memory of the secondary VM, it will be called after first migration. + */ +int colo_init_ram_cache(void) +{ + RAMBlock *block; + + rcu_read_lock(); + QLIST_FOREACH_RCU(block, &ram_list.blocks, next) { + block->colo_cache =3D qemu_anon_ram_alloc(block->used_length, NULL= ); + if (!block->colo_cache) { + error_report("%s: Can't alloc memory for COLO cache of block %= s," + "size 0x" RAM_ADDR_FMT, __func__, block->idstr, + block->used_length); + goto out_locked; + } + memcpy(block->colo_cache, block->host, block->used_length); + } + rcu_read_unlock(); + ram_cache_enable =3D true; + return 0; + +out_locked: + QLIST_FOREACH_RCU(block, &ram_list.blocks, next) { + if (block->colo_cache) { + qemu_anon_ram_free(block->colo_cache, block->used_length); + block->colo_cache =3D NULL; + } + } + + rcu_read_unlock(); + return -errno; +} + +void colo_release_ram_cache(void) +{ + RAMBlock *block; + + ram_cache_enable =3D false; + + rcu_read_lock(); + QLIST_FOREACH_RCU(block, &ram_list.blocks, next) { + if (block->colo_cache) { + qemu_anon_ram_free(block->colo_cache, block->used_length); + block->colo_cache =3D NULL; + } + } + rcu_read_unlock(); +} + static SaveVMHandlers savevm_ram_handlers =3D { .save_live_setup =3D ram_save_setup, .save_live_iterate =3D 
ram_save_iterate, --=20 1.8.3.1 From nobody Thu Nov 6 18:58:36 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zoho.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1487735102422446.6177551305284; Tue, 21 Feb 2017 19:45:02 -0800 (PST) Received: from localhost ([::1]:49644 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cgNrQ-0005rb-LV for importer@patchew.org; Tue, 21 Feb 2017 22:45:00 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:42583) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cgNpO-0004i7-5c for qemu-devel@nongnu.org; Tue, 21 Feb 2017 22:42:55 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1cgNpJ-0002tb-F0 for qemu-devel@nongnu.org; Tue, 21 Feb 2017 22:42:54 -0500 Received: from [45.249.212.189] (port=2396 helo=dggrg03-dlp.huawei.com) by eggs.gnu.org with esmtps (TLS1.0:RSA_ARCFOUR_SHA1:16) (Exim 4.71) (envelope-from ) id 1cgNpI-0002pc-R7 for qemu-devel@nongnu.org; Tue, 21 Feb 2017 22:42:49 -0500 Received: from 172.30.72.53 (EHLO DGGEMM403-HUB.china.huawei.com) ([172.30.72.53]) by dggrg03-dlp.huawei.com (MOS 4.4.6-GA FastPath queued) with ESMTP id AIX52736; Wed, 22 Feb 2017 11:42:41 +0800 (CST) Received: from DGGEML401-HUB.china.huawei.com (10.3.17.32) by DGGEMM403-HUB.china.huawei.com (10.3.20.211) with Microsoft SMTP Server (TLS) id 14.3.301.0; Wed, 22 Feb 2017 11:42:40 +0800 Received: from localhost (10.177.24.212) by DGGEML401-HUB.china.huawei.com (10.3.17.32) with Microsoft SMTP Server id 14.3.301.0; Wed, 22 Feb 
2017 11:42:30 +0800 From: zhanghailiang To: , , Date: Wed, 22 Feb 2017 11:42:09 +0800 Message-ID: <1487734936-43472-9-git-send-email-zhang.zhanghailiang@huawei.com> X-Mailer: git-send-email 2.7.2.windows.1 In-Reply-To: <1487734936-43472-1-git-send-email-zhang.zhanghailiang@huawei.com> References: <1487734936-43472-1-git-send-email-zhang.zhanghailiang@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.177.24.212] X-CFilter-Loop: Reflected X-Mirapoint-Virus-RAPID-Raw: score=unknown(0), refid=str=0001.0A020206.58AD08B1.02AF, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0, ip=0.0.0.0, so=2014-11-16 11:51:01, dmn=2013-03-21 17:37:32 X-Mirapoint-Loop-Id: 35cd63ebd46ad3b7d8a1b836636a228c X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.4.x-2.6.x [generic] [fuzzy] X-Received-From: 45.249.212.189 Subject: [Qemu-devel] [PATCH 08/15] ram/COLO: Record the dirty pages that SVM received X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: xiecl.fnst@cn.fujitsu.com, zhanghailiang , lizhijian@cn.fujitsu.com, Juan Quintela Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" We record the addresses of the dirty pages that the SVM has received; this helps when flushing the cached pages into the SVM. We record them by re-using the migration dirty bitmap. Cc: Juan Quintela Signed-off-by: zhanghailiang Reviewed-by: Dr. 
David Alan Gilbert --- migration/ram.c | 30 ++++++++++++++++++++++++++++++ 1 file changed, 30 insertions(+) diff --git a/migration/ram.c b/migration/ram.c index b588990..ed3b606 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -2231,6 +2231,9 @@ static inline void *host_from_ram_block_offset(RAMBlo= ck *block, static inline void *colo_cache_from_block_offset(RAMBlock *block, ram_addr_t offset) { + unsigned long *bitmap; + long k; + if (!offset_in_ramblock(block, offset)) { return NULL; } @@ -2239,6 +2242,17 @@ static inline void *colo_cache_from_block_offset(RAM= Block *block, __func__, block->idstr); return NULL; } + + k =3D (memory_region_get_ram_addr(block->mr) + offset) >> TARGET_PAGE_= BITS; + bitmap =3D atomic_rcu_read(&migration_bitmap_rcu)->bmap; + /* + * During colo checkpoint, we need bitmap of these migrated pages. + * It help us to decide which pages in ram cache should be flushed + * into VM's RAM later. + */ + if (!test_and_set_bit(k, bitmap)) { + migration_dirty_pages++; + } return block->colo_cache + offset; } =20 @@ -2664,6 +2678,7 @@ static int ram_load(QEMUFile *f, void *opaque, int ve= rsion_id) int colo_init_ram_cache(void) { RAMBlock *block; + int64_t ram_cache_pages =3D last_ram_offset() >> TARGET_PAGE_BITS; =20 rcu_read_lock(); QLIST_FOREACH_RCU(block, &ram_list.blocks, next) { @@ -2678,6 +2693,15 @@ int colo_init_ram_cache(void) } rcu_read_unlock(); ram_cache_enable =3D true; + /* + * Record the dirty pages that sent by PVM, we use this dirty bitmap to= gether + * with to decide which page in cache should be flushed into SVM's RAM.= Here + * we use the same name 'migration_bitmap_rcu' as for migration. 
+ */ + migration_bitmap_rcu =3D g_new0(struct BitmapRcu, 1); + migration_bitmap_rcu->bmap =3D bitmap_new(ram_cache_pages); + migration_dirty_pages =3D 0; + return 0; =20 out_locked: @@ -2695,9 +2719,15 @@ out_locked: void colo_release_ram_cache(void) { RAMBlock *block; + struct BitmapRcu *bitmap =3D migration_bitmap_rcu; =20 ram_cache_enable =3D false; =20 + atomic_rcu_set(&migration_bitmap_rcu, NULL); + if (bitmap) { + call_rcu(bitmap, migration_bitmap_free, rcu); + } + rcu_read_lock(); QLIST_FOREACH_RCU(block, &ram_list.blocks, next) { if (block->colo_cache) { --=20 1.8.3.1 From nobody Thu Nov 6 18:58:36 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zoho.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1487735103000463.6148623324508; Tue, 21 Feb 2017 19:45:03 -0800 (PST) Received: from localhost ([::1]:49643 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cgNrP-0005rU-HB for importer@patchew.org; Tue, 21 Feb 2017 22:44:59 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:42569) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cgNpN-0004i2-If for qemu-devel@nongnu.org; Tue, 21 Feb 2017 22:42:55 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1cgNpJ-0002tT-EW for qemu-devel@nongnu.org; Tue, 21 Feb 2017 22:42:53 -0500 Received: from [45.249.212.189] (port=2400 helo=dggrg03-dlp.huawei.com) by eggs.gnu.org with esmtps (TLS1.0:RSA_ARCFOUR_SHA1:16) (Exim 4.71) (envelope-from ) id 1cgNpI-0002pi-Qj for qemu-devel@nongnu.org; Tue, 21 Feb 
2017 22:42:49 -0500 Received: from 172.30.72.53 (EHLO DGGEMM403-HUB.china.huawei.com) ([172.30.72.53]) by dggrg03-dlp.huawei.com (MOS 4.4.6-GA FastPath queued) with ESMTP id AIX52743; Wed, 22 Feb 2017 11:42:43 +0800 (CST) Received: from DGGEML401-HUB.china.huawei.com (10.3.17.32) by DGGEMM403-HUB.china.huawei.com (10.3.20.211) with Microsoft SMTP Server (TLS) id 14.3.301.0; Wed, 22 Feb 2017 11:42:40 +0800 Received: from localhost (10.177.24.212) by DGGEML401-HUB.china.huawei.com (10.3.17.32) with Microsoft SMTP Server id 14.3.301.0; Wed, 22 Feb 2017 11:42:31 +0800 From: zhanghailiang To: , , Date: Wed, 22 Feb 2017 11:42:10 +0800 Message-ID: <1487734936-43472-10-git-send-email-zhang.zhanghailiang@huawei.com> X-Mailer: git-send-email 2.7.2.windows.1 In-Reply-To: <1487734936-43472-1-git-send-email-zhang.zhanghailiang@huawei.com> References: <1487734936-43472-1-git-send-email-zhang.zhanghailiang@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.177.24.212] X-CFilter-Loop: Reflected X-Mirapoint-Virus-RAPID-Raw: score=unknown(0), refid=str=0001.0A020201.58AD08B3.00B6, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0, ip=0.0.0.0, so=2014-11-16 11:51:01, dmn=2013-03-21 17:37:32 X-Mirapoint-Loop-Id: 284190f6ab331e9f6d011a955b26e4a0 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.4.x-2.6.x [generic] [fuzzy] X-Received-From: 45.249.212.189 Subject: [Qemu-devel] [PATCH 09/15] COLO: Flush PVM's cached RAM into SVM's memory X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: xiecl.fnst@cn.fujitsu.com, zhanghailiang , lizhijian@cn.fujitsu.com, Juan Quintela Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" During the time of VM's running, PVM may dirty some pages, we will transfer PVM's dirty 
pages to the SVM and store them in the SVM's RAM cache at the next checkpoint. So, the content of the SVM's RAM cache will always be the same as the PVM's memory after a checkpoint. Instead of flushing the entire content of the PVM's RAM cache into the SVM's memory, we do this in a more efficient way: only flush pages dirtied by the PVM since the last checkpoint. In this way, we can ensure that the SVM's memory is the same as the PVM's. Besides, we must make sure to flush the RAM cache before loading the device state. Cc: Juan Quintela Signed-off-by: zhanghailiang Signed-off-by: Li Zhijian Reviewed-by: Dr. David Alan Gilbert --- include/migration/migration.h | 1 + migration/ram.c | 41 +++++++++++++++++++++++++++++++++++++++= ++ migration/trace-events | 2 ++ 3 files changed, 44 insertions(+) diff --git a/include/migration/migration.h b/include/migration/migration.h index 93c6148..ba5b97b 100644 --- a/include/migration/migration.h +++ b/include/migration/migration.h @@ -383,4 +383,5 @@ PostcopyState postcopy_state_set(PostcopyState new_stat= e); /* ram cache */ int colo_init_ram_cache(void); void colo_release_ram_cache(void); +void colo_flush_ram_cache(void); #endif diff --git a/migration/ram.c b/migration/ram.c index ed3b606..3f57fe0 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -2540,6 +2540,7 @@ static int ram_load(QEMUFile *f, void *opaque, int ve= rsion_id) * be atomic */ bool postcopy_running =3D postcopy_state_get() >=3D POSTCOPY_INCOMING_= LISTENING; + bool need_flush =3D false; =20 seq_iter++; =20 @@ -2574,6 +2575,7 @@ static int ram_load(QEMUFile *f, void *opaque, int ve= rsion_id) /* After going into COLO, we should load the Page into colo_ca= che */ if (ram_cache_enable) { host =3D colo_cache_from_block_offset(block, addr); + need_flush =3D true; } else { host =3D host_from_ram_block_offset(block, addr); } @@ -2668,6 +2670,10 @@ static int ram_load(QEMUFile *f, void *opaque, int v= ersion_id) wait_for_decompress_done(); rcu_read_unlock(); trace_ram_load_complete(ret, seq_iter); + + if (!ret && 
ram_cache_enable && need_flush) {
+        colo_flush_ram_cache();
+    }
     return ret;
 }
 
@@ -2738,6 +2744,41 @@ void colo_release_ram_cache(void)
     rcu_read_unlock();
 }
 
+/*
+ * Flush content of RAM cache into SVM's memory.
+ * Only flush the pages that were dirtied by PVM or SVM or both.
+ */
+void colo_flush_ram_cache(void)
+{
+    RAMBlock *block = NULL;
+    void *dst_host;
+    void *src_host;
+    ram_addr_t offset = 0;
+
+    trace_colo_flush_ram_cache_begin(migration_dirty_pages);
+    rcu_read_lock();
+    block = QLIST_FIRST_RCU(&ram_list.blocks);
+
+    while (block) {
+        ram_addr_t ram_addr_abs;
+        offset = migration_bitmap_find_dirty(block, offset, &ram_addr_abs);
+        migration_bitmap_clear_dirty(ram_addr_abs);
+
+        if (offset >= block->used_length) {
+            offset = 0;
+            block = QLIST_NEXT_RCU(block, next);
+        } else {
+            dst_host = block->host + offset;
+            src_host = block->colo_cache + offset;
+            memcpy(dst_host, src_host, TARGET_PAGE_SIZE);
+        }
+    }
+
+    rcu_read_unlock();
+    trace_colo_flush_ram_cache_end();
+    assert(migration_dirty_pages == 0);
+}
+
 static SaveVMHandlers savevm_ram_handlers = {
     .save_live_setup = ram_save_setup,
     .save_live_iterate = ram_save_iterate,
diff --git a/migration/trace-events b/migration/trace-events
index fa660e3..5d4cf80 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -71,6 +71,8 @@ migration_throttle(void) ""
 ram_load_postcopy_loop(uint64_t addr, int flags) "@%" PRIx64 " %x"
 ram_postcopy_send_discard_bitmap(void) ""
 ram_save_queue_pages(const char *rbname, size_t start, size_t len) "%s: start: %zx len: %zx"
+colo_flush_ram_cache_begin(uint64_t dirty_pages) "dirty_pages %" PRIu64
+colo_flush_ram_cache_end(void) ""
 
 # migration/migration.c
 await_return_path_close_on_source_close(void) ""
-- 
1.8.3.1

From nobody Thu Nov 6 18:58:36 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17;
envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zoho.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1487735900829900.2831268582771; Tue, 21 Feb 2017 19:58:20 -0800 (PST) Received: from localhost ([::1]:49713 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cgO4J-0002Kx-J5 for importer@patchew.org; Tue, 21 Feb 2017 22:58:19 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:42676) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cgNpS-0004l4-P0 for qemu-devel@nongnu.org; Tue, 21 Feb 2017 22:42:59 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1cgNpR-0002yI-Ma for qemu-devel@nongnu.org; Tue, 21 Feb 2017 22:42:58 -0500 Received: from [45.249.212.189] (port=2401 helo=dggrg03-dlp.huawei.com) by eggs.gnu.org with esmtps (TLS1.0:RSA_ARCFOUR_SHA1:16) (Exim 4.71) (envelope-from ) id 1cgNpR-0002wo-2h for qemu-devel@nongnu.org; Tue, 21 Feb 2017 22:42:57 -0500 Received: from 172.30.72.53 (EHLO DGGEMM403-HUB.china.huawei.com) ([172.30.72.53]) by dggrg03-dlp.huawei.com (MOS 4.4.6-GA FastPath queued) with ESMTP id AIX52741; Wed, 22 Feb 2017 11:42:42 +0800 (CST) Received: from DGGEML401-HUB.china.huawei.com (10.3.17.32) by DGGEMM403-HUB.china.huawei.com (10.3.20.211) with Microsoft SMTP Server (TLS) id 14.3.301.0; Wed, 22 Feb 2017 11:42:40 +0800 Received: from localhost (10.177.24.212) by DGGEML401-HUB.china.huawei.com (10.3.17.32) with Microsoft SMTP Server id 14.3.301.0; Wed, 22 Feb 2017 11:42:32 +0800 From: zhanghailiang To: , , Date: Wed, 22 Feb 2017 11:42:11 +0800 Message-ID: <1487734936-43472-11-git-send-email-zhang.zhanghailiang@huawei.com> X-Mailer: git-send-email 2.7.2.windows.1 In-Reply-To: 
<1487734936-43472-1-git-send-email-zhang.zhanghailiang@huawei.com> References: <1487734936-43472-1-git-send-email-zhang.zhanghailiang@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.177.24.212] X-CFilter-Loop: Reflected X-Mirapoint-Virus-RAPID-Raw: score=unknown(0), refid=str=0001.0A020201.58AD08B3.0031, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0, ip=0.0.0.0, so=2014-11-16 11:51:01, dmn=2013-03-21 17:37:32 X-Mirapoint-Loop-Id: 0fd1cd8251bec586b3b40c00a697cb5f X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.4.x-2.6.x [generic] [fuzzy] X-Received-From: 45.249.212.189 Subject: [Qemu-devel] [PATCH 10/15] qmp event: Add COLO_EXIT event to notify users while exited from COLO X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Michael Roth , xiecl.fnst@cn.fujitsu.com, zhanghailiang , lizhijian@cn.fujitsu.com, Markus Armbruster Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"

If an error happens during the VM's COLO FT stage, it is important to
notify users of this event. Together with 'x_colo_lost_heartbeat', users
can then intervene in COLO's failover work immediately.
If users do not want to get involved in COLO's failover verdict,
it is still necessary to notify them that we exited COLO mode.

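For illustration only, the rule this patch uses to pick the exit reason can be sketched as below: if no failover was requested, the exit is treated as an error, otherwise as a user request. The type and function names here are hypothetical stand-ins and not QEMU's definitions; the real code calls failover_get_state() and qapi_event_send_colo_exit().

```c
/*
 * Sketch only: hypothetical stand-ins for QEMU's failover state and
 * the COLOExitReason enum added by this patch.
 */
typedef enum {
    SKETCH_FAILOVER_NONE,       /* no failover was requested */
    SKETCH_FAILOVER_REQUIRE,
    SKETCH_FAILOVER_ACTIVE,
    SKETCH_FAILOVER_COMPLETED,
    SKETCH_FAILOVER_RELAUNCH
} SketchFailoverState;

typedef enum {
    SKETCH_EXIT_REQUEST,        /* exit follows an external request */
    SKETCH_EXIT_ERROR           /* exit follows an internal error */
} SketchExitReason;

/* Leaving COLO without any failover request can only mean an error. */
static SketchExitReason sketch_exit_reason(SketchFailoverState state)
{
    return state == SKETCH_FAILOVER_NONE ? SKETCH_EXIT_ERROR
                                         : SKETCH_EXIT_REQUEST;
}

/* String form, matching the 'request'/'error' values in the schema. */
static const char *sketch_exit_reason_str(SketchExitReason reason)
{
    return reason == SKETCH_EXIT_ERROR ? "error" : "request";
}
```

A management tool that receives COLO_EXIT would typically branch on the "reason" string in the same way.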
Cc: Markus Armbruster
Cc: Michael Roth
Signed-off-by: zhanghailiang
Signed-off-by: Li Zhijian
Reviewed-by: Eric Blake
---
 migration/colo.c | 19 +++++++++++++++++++
 qapi-schema.json | 14 ++++++++++++++
 qapi/event.json  | 21 +++++++++++++++++++++
 3 files changed, 54 insertions(+)

diff --git a/migration/colo.c b/migration/colo.c
index edb7f00..65d0802 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -468,6 +468,18 @@ out:
     }
 
     colo_compare_unregister_notifier(&packets_compare_notifier);
+    /*
+     * There are only two reasons we can get here: either some error
+     * happened, or the user triggered failover.
+     */
+    if (failover_get_state() == FAILOVER_STATUS_NONE) {
+        qapi_event_send_colo_exit(COLO_MODE_PRIMARY,
+                                  COLO_EXIT_REASON_ERROR, NULL);
+    } else {
+        qapi_event_send_colo_exit(COLO_MODE_PRIMARY,
+                                  COLO_EXIT_REASON_REQUEST, NULL);
+    }
+
     timer_del(s->colo_delay_timer);
 
     /* Hope this not to be too long to wait here */
@@ -708,6 +720,13 @@ out:
     if (local_err) {
         error_report_err(local_err);
     }
+    if (failover_get_state() == FAILOVER_STATUS_NONE) {
+        qapi_event_send_colo_exit(COLO_MODE_SECONDARY,
+                                  COLO_EXIT_REASON_ERROR, NULL);
+    } else {
+        qapi_event_send_colo_exit(COLO_MODE_SECONDARY,
+                                  COLO_EXIT_REASON_REQUEST, NULL);
+    }
 
     if (fb) {
         qemu_fclose(fb);
diff --git a/qapi-schema.json b/qapi-schema.json
index 0521054..bb73e8f 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1203,6 +1203,20 @@
   'data': [ 'none', 'require', 'active', 'completed', 'relaunch' ] }
 
 ##
+# @COLOExitReason:
+#
+# The reason for a COLO exit
+#
+# @request: COLO exit is due to an external request
+#
+# @error: COLO exit is due to an internal error
+#
+# Since: 2.9
+##
+{ 'enum': 'COLOExitReason',
+  'data': [ 'request', 'error' ] }
+
+##
 # @x-colo-lost-heartbeat:
 #
 # Tell qemu that heartbeat is lost, request it to do takeover procedures.
diff --git a/qapi/event.json b/qapi/event.json
index 970ff02..fe33628 100644
--- a/qapi/event.json
+++ b/qapi/event.json
@@ -441,6 +441,27 @@
   'data': { 'pass': 'int' } }
 
 ##
+# @COLO_EXIT:
+#
+# Emitted when the VM exits COLO mode, either because of an error or
+# at the request of users.
+#
+# @mode: which COLO mode the VM was in when it exited.
+#
+# @reason: describes the reason for the COLO exit.
+#
+# Since: 2.9
+#
+# Example:
+#
+# <- { "timestamp": {"seconds": 2032141960, "microseconds": 417172},
+#      "event": "COLO_EXIT", "data": {"mode": "primary", "reason": "request" } }
+#
+##
+{ 'event': 'COLO_EXIT',
+  'data': {'mode': 'COLOMode', 'reason': 'COLOExitReason' } }
+
+##
 # @ACPI_DEVICE_OST:
 #
 # Emitted when guest executes ACPI _OST method.
-- 
1.8.3.1

From nobody Thu Nov 6 18:58:36 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zoho.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1487735386022561.0100874784185; Tue, 21 Feb 2017 19:49:46 -0800 (PST) Received: from localhost ([::1]:49669 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cgNw0-0001wH-Nd for importer@patchew.org; Tue, 21 Feb 2017 22:49:44 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:42582) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cgNpO-0004i6-4o for qemu-devel@nongnu.org; Tue, 21 Feb 2017 22:42:55 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1cgNpJ-0002tL-E9 for qemu-devel@nongnu.org; Tue, 21 Feb 2017 22:42:54 -0500 Received: from
[45.249.212.189] (port=2398 helo=dggrg03-dlp.huawei.com) by eggs.gnu.org with esmtps (TLS1.0:RSA_ARCFOUR_SHA1:16) (Exim 4.71) (envelope-from ) id 1cgNpI-0002pa-Qb for qemu-devel@nongnu.org; Tue, 21 Feb 2017 22:42:49 -0500 Received: from 172.30.72.53 (EHLO DGGEMM403-HUB.china.huawei.com) ([172.30.72.53]) by dggrg03-dlp.huawei.com (MOS 4.4.6-GA FastPath queued) with ESMTP id AIX52738; Wed, 22 Feb 2017 11:42:41 +0800 (CST) Received: from DGGEML401-HUB.china.huawei.com (10.3.17.32) by DGGEMM403-HUB.china.huawei.com (10.3.20.211) with Microsoft SMTP Server (TLS) id 14.3.301.0; Wed, 22 Feb 2017 11:42:40 +0800 Received: from localhost (10.177.24.212) by DGGEML401-HUB.china.huawei.com (10.3.17.32) with Microsoft SMTP Server id 14.3.301.0; Wed, 22 Feb 2017 11:42:33 +0800 From: zhanghailiang To: , , Date: Wed, 22 Feb 2017 11:42:12 +0800 Message-ID: <1487734936-43472-12-git-send-email-zhang.zhanghailiang@huawei.com> X-Mailer: git-send-email 2.7.2.windows.1 In-Reply-To: <1487734936-43472-1-git-send-email-zhang.zhanghailiang@huawei.com> References: <1487734936-43472-1-git-send-email-zhang.zhanghailiang@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.177.24.212] X-CFilter-Loop: Reflected X-Mirapoint-Virus-RAPID-Raw: score=unknown(0), refid=str=0001.0A020206.58AD08B3.0022, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0, ip=0.0.0.0, so=2014-11-16 11:51:01, dmn=2013-03-21 17:37:32 X-Mirapoint-Loop-Id: 35196897032f94d7bd429c45c29385b6 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.4.x-2.6.x [generic] [fuzzy] X-Received-From: 45.249.212.189 Subject: [Qemu-devel] [PATCH 11/15] savevm: split save/find loadvm_handlers entry into two helper functions X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: xiecl.fnst@cn.fujitsu.com, zhanghailiang , lizhijian@cn.fujitsu.com, Juan Quintela Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org 
Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"

COLO's checkpoint process is based on the migration process: every time we
do a checkpoint, we repeat the savevm and loadvm process.

So we call qemu_loadvm_section_start_full() repeatedly. Each call adds all
migration section information to the loadvm_handlers list again, which
leads to a memory leak.

To fix this, we split the process of saving and finding a section entry
into two helper functions, and check whether the section info already
exists in the loadvm_handlers list before saving it.

This modification has no side effects for normal migration.

Cc: Juan Quintela
Signed-off-by: zhanghailiang
Reviewed-by: Dr. David Alan Gilbert
---
 migration/savevm.c | 55 ++++++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 40 insertions(+), 15 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index 5ecd264..9c2d239 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1821,6 +1821,37 @@ void loadvm_free_handlers(MigrationIncomingState *mis)
     }
 }
 
+static LoadStateEntry *loadvm_add_section_entry(MigrationIncomingState *mis,
+                                               SaveStateEntry *se,
+                                               uint32_t section_id,
+                                               uint32_t version_id)
+{
+    LoadStateEntry *le;
+
+    /* Add entry */
+    le = g_malloc0(sizeof(*le));
+
+    le->se = se;
+    le->section_id = section_id;
+    le->version_id = version_id;
+    QLIST_INSERT_HEAD(&mis->loadvm_handlers, le, entry);
+    return le;
+}
+
+static LoadStateEntry *loadvm_find_section_entry(MigrationIncomingState *mis,
+                                                uint32_t section_id)
+{
+    LoadStateEntry *le;
+
+    QLIST_FOREACH(le, &mis->loadvm_handlers, entry) {
+        if (le->section_id == section_id) {
+            break;
+        }
+    }
+
+    return le;
+}
+
 static int
 qemu_loadvm_section_start_full(QEMUFile *f, MigrationIncomingState *mis)
 {
@@ -1863,15 +1894,12 @@ qemu_loadvm_section_start_full(QEMUFile *f, MigrationIncomingState *mis)
         return -EINVAL;
     }
 
-    /* Add entry */
-    le = g_malloc0(sizeof(*le));
-
-    le->se = se;
-    le->section_id = section_id;
-    le->version_id = version_id;
-    QLIST_INSERT_HEAD(&mis->loadvm_handlers, le, entry);
-
-    ret = vmstate_load(f, le->se, le->version_id);
+    /* Check if we have saved this section info before, if not, save it */
+    le = loadvm_find_section_entry(mis, section_id);
+    if (!le) {
+        le = loadvm_add_section_entry(mis, se, section_id, version_id);
+    }
+    ret = vmstate_load(f, se, version_id);
     if (ret < 0) {
         error_report("error while loading state for instance 0x%x of"
                      " device '%s'", instance_id, idstr);
@@ -1894,12 +1922,9 @@ qemu_loadvm_section_part_end(QEMUFile *f, MigrationIncomingState *mis)
     section_id = qemu_get_be32(f);
 
     trace_qemu_loadvm_state_section_partend(section_id);
-    QLIST_FOREACH(le, &mis->loadvm_handlers, entry) {
-        if (le->section_id == section_id) {
-            break;
-        }
-    }
-    if (le == NULL) {
+
+    le = loadvm_find_section_entry(mis, section_id);
+    if (!le) {
         error_report("Unknown savevm section %d", section_id);
         return -EINVAL;
     }
-- 
1.8.3.1

From nobody Thu Nov 6 18:58:36 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zoho.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1487735899614417.4105548345442; Tue, 21 Feb 2017 19:58:19 -0800 (PST) Received: from localhost ([::1]:49712 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cgO4I-0002KQ-E2 for importer@patchew.org; Tue, 21 Feb 2017 22:58:18 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:42577) by lists.gnu.org with esmtp (Exim
4.71) (envelope-from ) id 1cgNpO-0004i4-1l for qemu-devel@nongnu.org; Tue, 21 Feb 2017 22:42:58 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1cgNpJ-0002uT-OP for qemu-devel@nongnu.org; Tue, 21 Feb 2017 22:42:54 -0500 Received: from [45.249.212.189] (port=2397 helo=dggrg03-dlp.huawei.com) by eggs.gnu.org with esmtps (TLS1.0:RSA_ARCFOUR_SHA1:16) (Exim 4.71) (envelope-from ) id 1cgNpI-0002pd-RR for qemu-devel@nongnu.org; Tue, 21 Feb 2017 22:42:49 -0500 Received: from 172.30.72.53 (EHLO DGGEMM403-HUB.china.huawei.com) ([172.30.72.53]) by dggrg03-dlp.huawei.com (MOS 4.4.6-GA FastPath queued) with ESMTP id AIX52737; Wed, 22 Feb 2017 11:42:41 +0800 (CST) Received: from DGGEML401-HUB.china.huawei.com (10.3.17.32) by DGGEMM403-HUB.china.huawei.com (10.3.20.211) with Microsoft SMTP Server (TLS) id 14.3.301.0; Wed, 22 Feb 2017 11:42:40 +0800 Received: from localhost (10.177.24.212) by DGGEML401-HUB.china.huawei.com (10.3.17.32) with Microsoft SMTP Server id 14.3.301.0; Wed, 22 Feb 2017 11:42:34 +0800 From: zhanghailiang To: , , Date: Wed, 22 Feb 2017 11:42:13 +0800 Message-ID: <1487734936-43472-13-git-send-email-zhang.zhanghailiang@huawei.com> X-Mailer: git-send-email 2.7.2.windows.1 In-Reply-To: <1487734936-43472-1-git-send-email-zhang.zhanghailiang@huawei.com> References: <1487734936-43472-1-git-send-email-zhang.zhanghailiang@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.177.24.212] X-CFilter-Loop: Reflected X-Mirapoint-Virus-RAPID-Raw: score=unknown(0), refid=str=0001.0A020201.58AD08B2.0157, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0, ip=0.0.0.0, so=2014-11-16 11:51:01, dmn=2013-03-21 17:37:32 X-Mirapoint-Loop-Id: 37a276f36ba67e2fdd88b5a09919f256 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.4.x-2.6.x [generic] [fuzzy] X-Received-From: 45.249.212.189 Subject: [Qemu-devel] [PATCH 12/15] savevm: split the process of different stages for loadvm/savevm X-BeenThere: qemu-devel@nongnu.org 
X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: xiecl.fnst@cn.fujitsu.com, zhanghailiang , lizhijian@cn.fujitsu.com, Juan Quintela Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"

There are several stages in the loadvm/savevm process, and in each stage
the incoming migration processes different types of sections.
We want to control these stages more precisely, which benefits COLO
performance: we do not have to save QEMU_VM_SECTION_START sections at
every checkpoint. Besides, we want to separate the saving/loading of
memory state from that of device state.

So we add three new helper functions: qemu_loadvm_state_begin(),
qemu_load_device_state() and qemu_savevm_live_state() to handle the
different stages of migration.

Besides, we make qemu_loadvm_state_main() and qemu_save_device_state()
public.

Cc: Juan Quintela
Signed-off-by: zhanghailiang
Signed-off-by: Li Zhijian
Reviewed-by: Dr.
David Alan Gilbert
---
 include/sysemu/sysemu.h |  6 ++++++
 migration/savevm.c      | 55 ++++++++++++++++++++++++++++++++++++++++++++++++-------
 2 files changed, 54 insertions(+), 7 deletions(-)

diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 7ed665a..95cae41 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -132,7 +132,13 @@ void qemu_savevm_send_postcopy_ram_discard(QEMUFile *f, const char *name,
                                            uint64_t *start_list,
                                            uint64_t *length_list);
 
+void qemu_savevm_live_state(QEMUFile *f);
+int qemu_save_device_state(QEMUFile *f);
+
 int qemu_loadvm_state(QEMUFile *f);
+int qemu_loadvm_state_begin(QEMUFile *f);
+int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis);
+int qemu_load_device_state(QEMUFile *f);
 
 extern int autostart;
 
diff --git a/migration/savevm.c b/migration/savevm.c
index 9c2d239..dac478b 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -54,6 +54,7 @@
 #include "qemu/cutils.h"
 #include "io/channel-buffer.h"
 #include "io/channel-file.h"
+#include "migration/colo.h"
 
 #ifndef ETH_P_RARP
 #define ETH_P_RARP 0x8035
@@ -1279,13 +1280,21 @@ done:
     return ret;
 }
 
-static int qemu_save_device_state(QEMUFile *f)
+void qemu_savevm_live_state(QEMUFile *f)
 {
-    SaveStateEntry *se;
+    /* save QEMU_VM_SECTION_END section */
+    qemu_savevm_state_complete_precopy(f, true);
+    qemu_put_byte(f, QEMU_VM_EOF);
+}
 
-    qemu_put_be32(f, QEMU_VM_FILE_MAGIC);
-    qemu_put_be32(f, QEMU_VM_FILE_VERSION);
+int qemu_save_device_state(QEMUFile *f)
+{
+    SaveStateEntry *se;
 
+    if (!migration_in_colo_state()) {
+        qemu_put_be32(f, QEMU_VM_FILE_MAGIC);
+        qemu_put_be32(f, QEMU_VM_FILE_VERSION);
+    }
     cpu_synchronize_all_states();
 
     QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
@@ -1336,8 +1345,6 @@ enum LoadVMExitCodes {
     LOADVM_QUIT     =  1,
 };
 
-static int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis);
-
 /* ------ incoming postcopy messages ------ */
 /* 'advise' arrives before any transfers
just to tell us that a postcopy
  * *might* happen - it might be skipped if precopy transferred everything
@@ -1942,7 +1949,7 @@ qemu_loadvm_section_part_end(QEMUFile *f, MigrationIncomingState *mis)
     return 0;
 }
 
-static int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis)
+int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis)
 {
     uint8_t section_type;
     int ret = 0;
@@ -2080,6 +2087,40 @@ int qemu_loadvm_state(QEMUFile *f)
     return ret;
 }
 
+int qemu_loadvm_state_begin(QEMUFile *f)
+{
+    MigrationIncomingState *mis = migration_incoming_get_current();
+    Error *local_err = NULL;
+    int ret;
+
+    if (qemu_savevm_state_blocked(&local_err)) {
+        error_report_err(local_err);
+        return -EINVAL;
+    }
+    /* Load QEMU_VM_SECTION_START section */
+    ret = qemu_loadvm_state_main(f, mis);
+    if (ret < 0) {
+        error_report("Failed to loadvm begin work: %d", ret);
+    }
+    return ret;
+}
+
+int qemu_load_device_state(QEMUFile *f)
+{
+    MigrationIncomingState *mis = migration_incoming_get_current();
+    int ret;
+
+    /* Load QEMU_VM_SECTION_FULL section */
+    ret = qemu_loadvm_state_main(f, mis);
+    if (ret < 0) {
+        error_report("Failed to load device state: %d", ret);
+        return ret;
+    }
+
+    cpu_synchronize_all_post_init();
+    return 0;
+}
+
 int save_vmstate(Monitor *mon, const char *name)
 {
     BlockDriverState *bs, *bs1;
-- 
1.8.3.1

From nobody Thu Nov 6 18:58:36 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zoho.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1487735897963780.4822833902367; Tue, 21 Feb 2017 19:58:17 -0800
(PST) Received: from localhost ([::1]:49710 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cgO4G-0002HP-DY for importer@patchew.org; Tue, 21 Feb 2017 22:58:16 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:42643) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cgNpR-0004jP-Aj for qemu-devel@nongnu.org; Tue, 21 Feb 2017 22:42:58 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1cgNpN-0002vu-0H for qemu-devel@nongnu.org; Tue, 21 Feb 2017 22:42:57 -0500 Received: from [45.249.212.188] (port=2934 helo=dggrg02-dlp.huawei.com) by eggs.gnu.org with esmtps (TLS1.0:RSA_ARCFOUR_SHA1:16) (Exim 4.71) (envelope-from ) id 1cgNpM-0002ru-4m for qemu-devel@nongnu.org; Tue, 21 Feb 2017 22:42:52 -0500 Received: from 172.30.72.55 (EHLO DGGEMM402-HUB.china.huawei.com) ([172.30.72.55]) by dggrg02-dlp.huawei.com (MOS 4.4.6-GA FastPath queued) with ESMTP id AIU51189; Wed, 22 Feb 2017 11:42:46 +0800 (CST) Received: from DGGEML401-HUB.china.huawei.com (10.3.17.32) by DGGEMM402-HUB.china.huawei.com (10.3.20.210) with Microsoft SMTP Server (TLS) id 14.3.301.0; Wed, 22 Feb 2017 11:42:44 +0800 Received: from localhost (10.177.24.212) by DGGEML401-HUB.china.huawei.com (10.3.17.32) with Microsoft SMTP Server id 14.3.301.0; Wed, 22 Feb 2017 11:42:34 +0800 From: zhanghailiang To: , , Date: Wed, 22 Feb 2017 11:42:14 +0800 Message-ID: <1487734936-43472-14-git-send-email-zhang.zhanghailiang@huawei.com> X-Mailer: git-send-email 2.7.2.windows.1 In-Reply-To: <1487734936-43472-1-git-send-email-zhang.zhanghailiang@huawei.com> References: <1487734936-43472-1-git-send-email-zhang.zhanghailiang@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.177.24.212] X-CFilter-Loop: Reflected X-Mirapoint-Virus-RAPID-Raw: score=unknown(0), refid=str=0001.0A020206.58AD08B7.0165, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0, ip=0.0.0.0, so=2014-11-16 11:51:01, dmn=2013-03-21 17:37:32 
X-Mirapoint-Loop-Id: 2b842c3f974fe0d56fc1bed8773d526d X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.4.x-2.6.x [generic] [fuzzy] X-Received-From: 45.249.212.188 Subject: [Qemu-devel] [PATCH 13/15] COLO: Separate the process of saving/loading ram and device state X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: xiecl.fnst@cn.fujitsu.com, zhanghailiang , lizhijian@cn.fujitsu.com, Juan Quintela Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"

We separate the process of saving/loading RAM and device state when doing
a checkpoint, and add new helpers for saving/loading RAM and device state.
With this change, we can transfer RAM directly from the primary side to
the secondary side without using a channel-buffer as an intermediary,
which also reduces the amount of extra memory used during a checkpoint.

Besides, we move the colo_flush_ram_cache() call to the proper position
after the above change.

Cc: Juan Quintela
Signed-off-by: zhanghailiang
Signed-off-by: Li Zhijian
Reviewed-by: Dr.
David Alan Gilbert
---
 migration/colo.c   | 48 ++++++++++++++++++++++++++++++++++++++----------
 migration/ram.c    |  5 -----
 migration/savevm.c |  4 ++++
 3 files changed, 42 insertions(+), 15 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index 65d0802..b17e8e3 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -308,11 +308,20 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
         goto out;
     }
 
+    colo_send_message(s->to_dst_file, COLO_MESSAGE_VMSTATE_SEND, &local_err);
+    if (local_err) {
+        goto out;
+    }
+
     /* Disable block migration */
     s->params.blk = 0;
     s->params.shared = 0;
-    qemu_savevm_state_header(fb);
-    qemu_savevm_state_begin(fb, &s->params);
+    qemu_savevm_state_begin(s->to_dst_file, &s->params);
+    ret = qemu_file_get_error(s->to_dst_file);
+    if (ret < 0) {
+        error_report("Save VM state begin error");
+        goto out;
+    }
 
     /* We call this API although this may do nothing on primary side. */
     qemu_mutex_lock_iothread();
@@ -323,15 +332,21 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
     }
 
     qemu_mutex_lock_iothread();
-    qemu_savevm_state_complete_precopy(fb, false);
+    /*
+     * Only save the VM's live state, not including the device state.
+     * TODO: We may need a timeout mechanism to prevent the COLO process
+     * from being blocked here.
+     */
+    qemu_savevm_live_state(s->to_dst_file);
+    /* Note: device state is saved into buffer */
+    ret = qemu_save_device_state(fb);
     qemu_mutex_unlock_iothread();
-
-    qemu_fflush(fb);
-
-    colo_send_message(s->to_dst_file, COLO_MESSAGE_VMSTATE_SEND, &local_err);
-    if (local_err) {
+    if (ret < 0) {
+        error_report("Save device state error");
         goto out;
     }
+    qemu_fflush(fb);
+
     /*
      * We need the size of the VMstate data in Secondary side,
      * With which we can decide how much data should be read.
@@ -644,6 +659,17 @@ void *colo_process_incoming_thread(void *opaque)
             goto out;
         }
 
+        ret = qemu_loadvm_state_begin(mis->from_src_file);
+        if (ret < 0) {
+            error_report("Load vm state begin error, ret=%d", ret);
+            goto out;
+        }
+        ret = qemu_loadvm_state_main(mis->from_src_file, mis);
+        if (ret < 0) {
+            error_report("Load VM's live state (ram) error");
+            goto out;
+        }
+
         value = colo_receive_message_value(mis->from_src_file,
                                  COLO_MESSAGE_VMSTATE_SIZE, &local_err);
         if (local_err) {
@@ -677,8 +703,10 @@ void *colo_process_incoming_thread(void *opaque)
         qemu_mutex_lock_iothread();
         qemu_system_reset(VMRESET_SILENT);
         vmstate_loading = true;
-        if (qemu_loadvm_state(fb) < 0) {
-            error_report("COLO: loadvm failed");
+        colo_flush_ram_cache();
+        ret = qemu_load_device_state(fb);
+        if (ret < 0) {
+            error_report("COLO: load device state failed");
             qemu_mutex_unlock_iothread();
             goto out;
         }
diff --git a/migration/ram.c b/migration/ram.c
index 3f57fe0..6227b94 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2540,7 +2540,6 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
      * be atomic
      */
     bool postcopy_running = postcopy_state_get() >= POSTCOPY_INCOMING_LISTENING;
-    bool need_flush = false;
 
     seq_iter++;
 
@@ -2575,7 +2574,6 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
             /* After going into COLO, we should load the Page into colo_cache */
             if (ram_cache_enable) {
                 host = colo_cache_from_block_offset(block, addr);
-                need_flush = true;
             } else {
                 host = host_from_ram_block_offset(block, addr);
             }
@@ -2671,9 +2669,6 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
     rcu_read_unlock();
     trace_ram_load_complete(ret, seq_iter);
 
-    if (!ret && ram_cache_enable && need_flush) {
-        colo_flush_ram_cache();
-    }
     return ret;
 }
 
diff --git a/migration/savevm.c b/migration/savevm.c
index dac478b..67e4306 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1002,6 +1002,10 @@ void
qemu_savevm_state_begin(QEMUFile *f,
             break;
         }
     }
+    if (migration_in_colo_state()) {
+        qemu_put_byte(f, QEMU_VM_EOF);
+        qemu_fflush(f);
+    }
 }
 
 /*
-- 
1.8.3.1

From nobody Thu Nov 6 18:58:36 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zoho.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1487735642902936.1189920706171; Tue, 21 Feb 2017 19:54:02 -0800 (PST) Received: from localhost ([::1]:49688 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cgO09-00061M-6j for importer@patchew.org; Tue, 21 Feb 2017 22:54:01 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:42638) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cgNpR-0004j6-0S for qemu-devel@nongnu.org; Tue, 21 Feb 2017 22:42:58 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1cgNpM-0002vj-Op for qemu-devel@nongnu.org; Tue, 21 Feb 2017 22:42:57 -0500 Received: from [45.249.212.188] (port=2935 helo=dggrg02-dlp.huawei.com) by eggs.gnu.org with esmtps (TLS1.0:RSA_ARCFOUR_SHA1:16) (Exim 4.71) (envelope-from ) id 1cgNpM-0002rv-4z for qemu-devel@nongnu.org; Tue, 21 Feb 2017 22:42:52 -0500 Received: from 172.30.72.55 (EHLO DGGEMM402-HUB.china.huawei.com) ([172.30.72.55]) by dggrg02-dlp.huawei.com (MOS 4.4.6-GA FastPath queued) with ESMTP id AIU51190; Wed, 22 Feb 2017 11:42:46 +0800 (CST) Received: from DGGEML401-HUB.china.huawei.com (10.3.17.32) by DGGEMM402-HUB.china.huawei.com (10.3.20.210) with Microsoft SMTP Server (TLS) id 14.3.301.0; Wed, 22 Feb 2017 11:42:44 +0800 Received:
from localhost (10.177.24.212) by DGGEML401-HUB.china.huawei.com (10.3.17.32) with Microsoft SMTP Server id 14.3.301.0; Wed, 22 Feb 2017 11:42:35 +0800 From: zhanghailiang To: , , Date: Wed, 22 Feb 2017 11:42:15 +0800 Message-ID: <1487734936-43472-15-git-send-email-zhang.zhanghailiang@huawei.com> X-Mailer: git-send-email 2.7.2.windows.1 In-Reply-To: <1487734936-43472-1-git-send-email-zhang.zhanghailiang@huawei.com> References: <1487734936-43472-1-git-send-email-zhang.zhanghailiang@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.177.24.212] X-CFilter-Loop: Reflected X-Mirapoint-Virus-RAPID-Raw: score=unknown(0), refid=str=0001.0A020203.58AD08B7.0148, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0, ip=0.0.0.0, so=2014-11-16 11:51:01, dmn=2013-03-21 17:37:32 X-Mirapoint-Loop-Id: a8fe2aad47ce1dfac14fe19574d91e14 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.4.x-2.6.x [generic] [fuzzy] X-Received-From: 45.249.212.188 Subject: [Qemu-devel] [PATCH 14/15] COLO: Split qemu_savevm_state_begin out of checkpoint process X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: xiecl.fnst@cn.fujitsu.com, zhanghailiang , lizhijian@cn.fujitsu.com, Juan Quintela Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"

It is unnecessary to call qemu_savevm_state_begin() in every checkpoint.
It mainly sets up devices and does the first device state pass; this data
does not change during later checkpoints. So we split it out of
colo_do_checkpoint_transaction(), which avoids transferring this data
again in subsequent checkpoints.

Cc: Juan Quintela
Signed-off-by: zhanghailiang
Signed-off-by: Li Zhijian
Reviewed-by: Dr.
David Alan Gilbert
---
 migration/colo.c | 52 ++++++++++++++++++++++++++++++++++++----------------
 1 file changed, 36 insertions(+), 16 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index b17e8e3..ab2d700 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -313,16 +313,6 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
         goto out;
     }
 
-    /* Disable block migration */
-    s->params.blk = 0;
-    s->params.shared = 0;
-    qemu_savevm_state_begin(s->to_dst_file, &s->params);
-    ret = qemu_file_get_error(s->to_dst_file);
-    if (ret < 0) {
-        error_report("Save VM state begin error");
-        goto out;
-    }
-
     /* We call this API although this may do nothing on primary side. */
     qemu_mutex_lock_iothread();
     replication_do_checkpoint_all(&local_err);
@@ -410,6 +400,21 @@ static void colo_compare_notify_checkpoint(Notifier *notifier, void *data)
     colo_checkpoint_notify(data);
 }
 
+static int colo_prepare_before_save(MigrationState *s)
+{
+    int ret;
+
+    /* Disable block migration */
+    s->params.blk = 0;
+    s->params.shared = 0;
+    qemu_savevm_state_begin(s->to_dst_file, &s->params);
+    ret = qemu_file_get_error(s->to_dst_file);
+    if (ret < 0) {
+        error_report("Save VM state begin error");
+    }
+    return ret;
+}
+
 static void colo_process_checkpoint(MigrationState *s)
 {
     QIOChannelBuffer *bioc;
@@ -429,6 +434,11 @@ static void colo_process_checkpoint(MigrationState *s)
     packets_compare_notifier.notify = colo_compare_notify_checkpoint;
     colo_compare_register_notifier(&packets_compare_notifier);
 
+    ret = colo_prepare_before_save(s);
+    if (ret < 0) {
+        goto out;
+    }
+
     /*
      * Wait for Secondary finish loading VM states and enter COLO
      * restore.
@@ -570,6 +580,17 @@ static void colo_wait_handle_message(QEMUFile *f, int *checkpoint_request,
     }
 }
 
+static int colo_prepare_before_load(QEMUFile *f)
+{
+    int ret;
+
+    ret = qemu_loadvm_state_begin(f);
+    if (ret < 0) {
+        error_report("Load VM state begin error, ret = %d", ret);
+    }
+    return ret;
+}
+
 void *colo_process_incoming_thread(void *opaque)
 {
     MigrationIncomingState *mis = opaque;
@@ -610,6 +631,11 @@ void *colo_process_incoming_thread(void *opaque)
     fb = qemu_fopen_channel_input(QIO_CHANNEL(bioc));
     object_unref(OBJECT(bioc));
 
+    ret = colo_prepare_before_load(mis->from_src_file);
+    if (ret < 0) {
+        goto out;
+    }
+
     qemu_mutex_lock_iothread();
     bdrv_invalidate_cache_all(&local_err);
     if (local_err) {
@@ -621,7 +647,6 @@ void *colo_process_incoming_thread(void *opaque)
     if (local_err) {
         goto out;
     }
-
     colo_send_message(mis->to_src_file, COLO_MESSAGE_CHECKPOINT_READY,
                       &local_err);
     if (local_err) {
@@ -659,11 +684,6 @@ void *colo_process_incoming_thread(void *opaque)
             goto out;
         }
 
-        ret = qemu_loadvm_state_begin(mis->from_src_file);
-        if (ret < 0) {
-            error_report("Load vm state begin error, ret=%d", ret);
-            goto out;
-        }
         ret = qemu_loadvm_state_main(mis->from_src_file, mis);
         if (ret < 0) {
             error_report("Load VM's live state (ram) error");
-- 
1.8.3.1

From nobody Thu Nov 6 18:58:36 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zoho.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1487735646341664.1086026924372; Tue, 21 Feb 2017 19:54:06 -0800 (PST) Received: from localhost ([::1]:49691
helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cgO0D-000668-2p for importer@patchew.org; Tue, 21 Feb 2017 22:54:05 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:42636) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cgNpR-0004j4-03 for qemu-devel@nongnu.org; Tue, 21 Feb 2017 22:42:57 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1cgNpM-0002ve-OR for qemu-devel@nongnu.org; Tue, 21 Feb 2017 22:42:57 -0500 Received: from [45.249.212.188] (port=2933 helo=dggrg02-dlp.huawei.com) by eggs.gnu.org with esmtps (TLS1.0:RSA_ARCFOUR_SHA1:16) (Exim 4.71) (envelope-from ) id 1cgNpM-0002rt-53 for qemu-devel@nongnu.org; Tue, 21 Feb 2017 22:42:52 -0500 Received: from 172.30.72.55 (EHLO DGGEMM402-HUB.china.huawei.com) ([172.30.72.55]) by dggrg02-dlp.huawei.com (MOS 4.4.6-GA FastPath queued) with ESMTP id AIU51193; Wed, 22 Feb 2017 11:42:46 +0800 (CST) Received: from DGGEML401-HUB.china.huawei.com (10.3.17.32) by DGGEMM402-HUB.china.huawei.com (10.3.20.210) with Microsoft SMTP Server (TLS) id 14.3.301.0; Wed, 22 Feb 2017 11:42:45 +0800 Received: from localhost (10.177.24.212) by DGGEML401-HUB.china.huawei.com (10.3.17.32) with Microsoft SMTP Server id 14.3.301.0; Wed, 22 Feb 2017 11:42:36 +0800 From: zhanghailiang To: , , Date: Wed, 22 Feb 2017 11:42:16 +0800 Message-ID: <1487734936-43472-16-git-send-email-zhang.zhanghailiang@huawei.com> X-Mailer: git-send-email 2.7.2.windows.1 In-Reply-To: <1487734936-43472-1-git-send-email-zhang.zhanghailiang@huawei.com> References: <1487734936-43472-1-git-send-email-zhang.zhanghailiang@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.177.24.212] X-CFilter-Loop: Reflected X-Mirapoint-Virus-RAPID-Raw: score=unknown(0), refid=str=0001.0A020205.58AD08B7.0138, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0, ip=0.0.0.0, so=2014-11-16 11:51:01, dmn=2013-03-21 17:37:32 X-Mirapoint-Loop-Id: 8e973c2ff52813e6ecc4df6b60cacc96 
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.4.x-2.6.x [generic] [fuzzy] X-Received-From: 45.249.212.188 Subject: [Qemu-devel] [PATCH 15/15] COLO: flush host dirty ram from cache X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: xiecl.fnst@cn.fujitsu.com, zhanghailiang , lizhijian@cn.fujitsu.com, Juan Quintela Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"

There is no need to flush all of the VM's RAM from the cache; flush only
the pages dirtied since the last checkpoint.

Cc: Juan Quintela
Signed-off-by: Li Zhijian
Signed-off-by: Zhang Chen
Signed-off-by: zhanghailiang
---
 migration/ram.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/migration/ram.c b/migration/ram.c
index 6227b94..e9ba740 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2702,6 +2702,7 @@ int colo_init_ram_cache(void)
     migration_bitmap_rcu = g_new0(struct BitmapRcu, 1);
     migration_bitmap_rcu->bmap = bitmap_new(ram_cache_pages);
     migration_dirty_pages = 0;
+    memory_global_dirty_log_start();
 
     return 0;
 
@@ -2750,6 +2751,15 @@ void colo_flush_ram_cache(void)
     void *src_host;
     ram_addr_t offset = 0;
 
+    memory_global_dirty_log_sync();
+    qemu_mutex_lock(&migration_bitmap_mutex);
+    rcu_read_lock();
+    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
+        migration_bitmap_sync_range(block->offset, block->used_length);
+    }
+    rcu_read_unlock();
+    qemu_mutex_unlock(&migration_bitmap_mutex);
+
     trace_colo_flush_ram_cache_begin(migration_dirty_pages);
     rcu_read_lock();
     block = QLIST_FIRST_RCU(&ram_list.blocks);
-- 
1.8.3.1
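
For readers unfamiliar with the idea behind patch 15/15, the following is a
minimal standalone sketch of flushing only dirty pages from a cache, driven
by a dirty bitmap. It is not QEMU code: SKETCH_PAGE_SIZE, mark_dirty(),
test_and_clear_dirty() and flush_dirty_pages() are hypothetical names made up
for this illustration, and the flat buffers stand in for guest RAM and the
COLO RAM cache. Locking, RCU and the dirty-log sync done by the real patch
are deliberately left out.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical page size for this sketch; not taken from QEMU. */
#define SKETCH_PAGE_SIZE 4096u
#define SKETCH_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Set the dirty bit for page number 'page'. */
static void mark_dirty(unsigned long *bitmap, size_t page)
{
    bitmap[page / SKETCH_BITS_PER_WORD] |=
        1UL << (page % SKETCH_BITS_PER_WORD);
}

/* Return whether 'page' was dirty, and clear its bit in either case. */
static int test_and_clear_dirty(unsigned long *bitmap, size_t page)
{
    unsigned long mask = 1UL << (page % SKETCH_BITS_PER_WORD);
    unsigned long *word = &bitmap[page / SKETCH_BITS_PER_WORD];
    int was_dirty = (*word & mask) != 0;

    *word &= ~mask;
    return was_dirty;
}

/*
 * Copy only the pages marked dirty from 'cache' into 'ram' and clear
 * their dirty bits. Returns the number of pages copied. Clean pages
 * are skipped, which is the saving the commit message describes.
 */
static size_t flush_dirty_pages(unsigned char *ram,
                                const unsigned char *cache,
                                unsigned long *bitmap, size_t npages)
{
    size_t copied = 0;

    assert(ram && cache && bitmap);
    for (size_t page = 0; page < npages; page++) {
        if (test_and_clear_dirty(bitmap, page)) {
            memcpy(ram + page * SKETCH_PAGE_SIZE,
                   cache + page * SKETCH_PAGE_SIZE,
                   SKETCH_PAGE_SIZE);
            copied++;
        }
    }
    return copied;
}
```

With four pages and only page 2 marked via mark_dirty(), a call to
flush_dirty_pages() copies that single page and leaves the bitmap empty, so
an immediate second call copies nothing. The cost of a flush then scales
with the number of pages dirtied since the last checkpoint rather than with
total RAM size.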