From nobody Thu May 16 07:55:05 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (208.118.235.17 [208.118.235.17]) by mx.zohomail.com with SMTPS id 1539919507907176.83355760909376; Thu, 18 Oct 2018 20:25:07 -0700 (PDT) Received: from localhost ([::1]:46456 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLPF-0003qr-Se for importer@patchew.org; Thu, 18 Oct 2018 23:24:57 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:36256) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLN3-0002ME-PV for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:22:43 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gDLMz-0002W1-TT for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:22:41 -0400 Received: from mx1.redhat.com ([209.132.183.28]:59910) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gDLMz-0002QY-FI for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:22:37 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 61D29BB1E; Fri, 19 Oct 2018 03:22:33 +0000 (UTC) Received: from jason-ThinkPad-T430s.redhat.com (ovpn-12-114.pek2.redhat.com [10.72.12.114]) by smtp.corp.redhat.com (Postfix) with ESMTP id 0F9A55C69A; Fri, 19 Oct 2018 03:22:30 +0000 (UTC) From: Jason Wang To: peter.maydell@linaro.org, qemu-devel@nongnu.org Date: Fri, 19 Oct 2018 11:22:00 +0800 Message-Id: <1539919345-10703-2-git-send-email-jasowang@redhat.com> In-Reply-To: <1539919345-10703-1-git-send-email-jasowang@redhat.com> References: <1539919345-10703-1-git-send-email-jasowang@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.26]); Fri, 19 Oct 2018 03:22:33 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PULL V2 01/26] filter-rewriter: Add TCP state machine and fix memory leak in connection_track_table X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Zhang Chen , Jason Wang , zhanghailiang , Zhang Chen Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RDMRC_1 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" From: Zhang Chen We add almost full TCP state machine in filter-rewriter, except TCPS_LISTEN and some simplify in VM active close FIN states. The reason for this simplify job is because guest kernel will track the TCP status and wait 2MSL time too, if client resend the FIN packet, guest will resend the last ACK, so we needn't wait 2MSL time in filter-rewr= iter. After a net connection is closed, we didn't clear its related resources in connection_track_table, which will lead to memory leak. Let's track the state of net connection, if it is closed, its related resources will be cleared up. Signed-off-by: zhanghailiang Signed-off-by: Zhang Chen Signed-off-by: Zhang Chen Signed-off-by: Jason Wang --- net/colo.c | 2 +- net/colo.h | 9 ++--- net/filter-rewriter.c | 109 +++++++++++++++++++++++++++++++++++++++++++++-= ---- 3 files changed, 104 insertions(+), 16 deletions(-) diff --git a/net/colo.c b/net/colo.c index 6dda4ed..97c8fc9 100644 --- a/net/colo.c +++ b/net/colo.c @@ -137,7 +137,7 @@ Connection *connection_new(ConnectionKey *key) conn->ip_proto =3D key->ip_proto; conn->processing =3D false; conn->offset =3D 0; - conn->syn_flag =3D 0; + conn->tcp_state =3D TCPS_CLOSED; conn->pack =3D 0; conn->sack =3D 0; g_queue_init(&conn->primary_list); diff --git a/net/colo.h b/net/colo.h index da6c36d..0277e0e 100644 --- a/net/colo.h +++ b/net/colo.h @@ -18,6 +18,7 @@ #include "slirp/slirp.h" #include "qemu/jhash.h" #include "qemu/timer.h" +#include "slirp/tcp.h" =20 #define HASHTABLE_MAX_SIZE 16384 =20 @@ -81,11 +82,9 @@ typedef struct Connection { uint32_t sack; /* offset =3D secondary_seq - primary_seq */ tcp_seq offset; - /* - * we use this flag update offset func - * run once in independent tcp connection - */ - int syn_flag; + + int tcp_state; /* TCP FSM state */ + tcp_seq fin_ack_seq; /* the seq of 'fin=3D1,ack=3D1' */ } Connection; =20 uint32_t connection_key_hash(const void *opaque); diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c index f584e4e..dd323fa 100644 --- a/net/filter-rewriter.c +++ b/net/filter-rewriter.c @@ -59,9 +59,9 @@ static int is_tcp_packet(Packet *pkt) } =20 /* handle tcp packet from primary guest */ -static int handle_primary_tcp_pkt(NetFilterState *nf, +static int handle_primary_tcp_pkt(RewriterState *rf, Connection *conn, - Packet *pkt) + Packet *pkt, ConnectionKey *key) { struct tcphdr *tcp_pkt; =20 @@ -74,23 +74,28 @@ static int handle_primary_tcp_pkt(NetFilterState *nf, trace_colo_filter_rewriter_conn_offset(conn->offset); } =20 + if (((tcp_pkt->th_flags & (TH_ACK | TH_SYN)) =3D=3D (TH_ACK | TH_SYN))= && + conn->tcp_state =3D=3D TCPS_SYN_SENT) { + conn->tcp_state =3D TCPS_ESTABLISHED; + } + if (((tcp_pkt->th_flags & (TH_ACK | TH_SYN)) =3D=3D TH_SYN)) { /* * we use this flag update offset func * run once in independent tcp connection */ - conn->syn_flag =3D 1; + conn->tcp_state =3D TCPS_SYN_RECEIVED; } =20 if (((tcp_pkt->th_flags & (TH_ACK | TH_SYN)) =3D=3D TH_ACK)) { - if (conn->syn_flag) { + if (conn->tcp_state =3D=3D TCPS_SYN_RECEIVED) { /* * offset =3D secondary_seq - primary seq * ack packet sent by guest from primary node, * so we use th_ack - 1 get primary_seq */ conn->offset -=3D (ntohl(tcp_pkt->th_ack) - 1); - conn->syn_flag =3D 0; + conn->tcp_state =3D TCPS_ESTABLISHED; } if (conn->offset) { /* handle packets to the secondary from the primary */ @@ -99,15 +104,66 @@ static int handle_primary_tcp_pkt(NetFilterState *nf, net_checksum_calculate((uint8_t *)pkt->data + pkt->vnet_hdr_le= n, pkt->size - pkt->vnet_hdr_len); } + + /* + * Passive close step 3 + */ + if ((conn->tcp_state =3D=3D TCPS_LAST_ACK) && + (ntohl(tcp_pkt->th_ack) =3D=3D (conn->fin_ack_seq + 1))) { + conn->tcp_state =3D TCPS_CLOSED; + g_hash_table_remove(rf->connection_track_table, key); + } + } + + if ((tcp_pkt->th_flags & TH_FIN) =3D=3D TH_FIN) { + /* + * Passive close. + * Step 1: + * The *server* side of this connect is VM, *client* tries to close + * the connection. We will into CLOSE_WAIT status. + * + * Step 2: + * In this step we will into LAST_ACK status. + * + * We got 'fin=3D1, ack=3D1' packet from server side, we need to + * record the seq of 'fin=3D1, ack=3D1' packet. + * + * Step 3: + * We got 'ack=3D1' packets from client side, it acks 'fin=3D1, ac= k=3D1' + * packet from server side. From this point, we can ensure that th= ere + * will be no packets in the connection, except that, some errors + * happen between the path of 'filter object' and vNIC, if this ra= re + * case really happen, we can still create a new connection, + * So it is safe to remove the connection from connection_track_ta= ble. + * + */ + if (conn->tcp_state =3D=3D TCPS_ESTABLISHED) { + conn->tcp_state =3D TCPS_CLOSE_WAIT; + } + + /* + * Active close step 2. + */ + if (conn->tcp_state =3D=3D TCPS_FIN_WAIT_1) { + conn->tcp_state =3D TCPS_TIME_WAIT; + /* + * For simplify implementation, we needn't wait 2MSL time + * in filter rewriter. Because guest kernel will track the + * TCP status and wait 2MSL time, if client resend the FIN + * packet, guest will apply the last ACK too. + */ + conn->tcp_state =3D TCPS_CLOSED; + g_hash_table_remove(rf->connection_track_table, key); + } } =20 return 0; } =20 /* handle tcp packet from secondary guest */ -static int handle_secondary_tcp_pkt(NetFilterState *nf, +static int handle_secondary_tcp_pkt(RewriterState *rf, Connection *conn, - Packet *pkt) + Packet *pkt, ConnectionKey *key) { struct tcphdr *tcp_pkt; =20 @@ -121,7 +177,8 @@ static int handle_secondary_tcp_pkt(NetFilterState *nf, trace_colo_filter_rewriter_conn_offset(conn->offset); } =20 - if (((tcp_pkt->th_flags & (TH_ACK | TH_SYN)) =3D=3D (TH_ACK | TH_SYN))= ) { + if (conn->tcp_state =3D=3D TCPS_SYN_RECEIVED && + ((tcp_pkt->th_flags & (TH_ACK | TH_SYN)) =3D=3D (TH_ACK | TH_SYN))= ) { /* * save offset =3D secondary_seq and then * in handle_primary_tcp_pkt make offset @@ -130,6 +187,12 @@ static int handle_secondary_tcp_pkt(NetFilterState *nf, conn->offset =3D ntohl(tcp_pkt->th_seq); } =20 + /* VM active connect */ + if (conn->tcp_state =3D=3D TCPS_CLOSED && + ((tcp_pkt->th_flags & (TH_ACK | TH_SYN)) =3D=3D TH_SYN)) { + conn->tcp_state =3D TCPS_SYN_SENT; + } + if ((tcp_pkt->th_flags & (TH_ACK | TH_SYN)) =3D=3D TH_ACK) { /* Only need to adjust seq while offset is Non-zero */ if (conn->offset) { @@ -141,6 +204,32 @@ static int handle_secondary_tcp_pkt(NetFilterState *nf, } } =20 + /* + * Passive close step 2: + */ + if (conn->tcp_state =3D=3D TCPS_CLOSE_WAIT && + (tcp_pkt->th_flags & (TH_ACK | TH_FIN)) =3D=3D (TH_ACK | TH_FIN)) { + conn->fin_ack_seq =3D ntohl(tcp_pkt->th_seq); + conn->tcp_state =3D TCPS_LAST_ACK; + } + + /* + * Active close + * + * Step 1: + * The *server* side of this connect is VM, *server* tries to close + * the connection. + * + * Step 2: + * We will into CLOSE_WAIT status. + * We simplify the TCPS_FIN_WAIT_2, TCPS_TIME_WAIT and + * CLOSING status. + */ + if (conn->tcp_state =3D=3D TCPS_ESTABLISHED && + (tcp_pkt->th_flags & (TH_ACK | TH_FIN)) =3D=3D TH_FIN) { + conn->tcp_state =3D TCPS_FIN_WAIT_1; + } + return 0; } =20 @@ -190,7 +279,7 @@ static ssize_t colo_rewriter_receive_iov(NetFilterState= *nf, =20 if (sender =3D=3D nf->netdev) { /* NET_FILTER_DIRECTION_TX */ - if (!handle_primary_tcp_pkt(nf, conn, pkt)) { + if (!handle_primary_tcp_pkt(s, conn, pkt, &key)) { qemu_net_queue_send(s->incoming_queue, sender, 0, (const uint8_t *)pkt->data, pkt->size, NULL); packet_destroy(pkt, NULL); @@ -203,7 +292,7 @@ static ssize_t colo_rewriter_receive_iov(NetFilterState= *nf, } } else { /* NET_FILTER_DIRECTION_RX */ - if (!handle_secondary_tcp_pkt(nf, conn, pkt)) { + if (!handle_secondary_tcp_pkt(s, conn, pkt, &key)) { qemu_net_queue_send(s->incoming_queue, sender, 0, (const uint8_t *)pkt->data, pkt->size, NULL); packet_destroy(pkt, NULL); --=20 2.5.0 From nobody Thu May 16 07:55:05 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (208.118.235.17 [208.118.235.17]) by mx.zohomail.com with SMTPS id 1539919504192796.9082529187257; Thu, 18 Oct 2018 20:25:04 -0700 (PDT) Received: from localhost ([::1]:46457 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLPF-0003sH-Bw for importer@patchew.org; Thu, 18 Oct 2018 23:24:57 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:36257) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLN3-0002MF-Pd for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:22:43 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gDLMz-0002Vq-RW for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:22:41 -0400 Received: from mx1.redhat.com ([209.132.183.28]:37816) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gDLMz-0002UW-FU for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:22:37 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 32786723D9; Fri, 19 Oct 2018 03:22:36 +0000 (UTC) Received: from jason-ThinkPad-T430s.redhat.com (ovpn-12-114.pek2.redhat.com [10.72.12.114]) by smtp.corp.redhat.com (Postfix) with ESMTP id E1C195C69A; Fri, 19 Oct 2018 03:22:33 +0000 (UTC) From: Jason Wang To: peter.maydell@linaro.org, qemu-devel@nongnu.org Date: Fri, 19 Oct 2018 11:22:01 +0800 Message-Id: <1539919345-10703-3-git-send-email-jasowang@redhat.com> In-Reply-To: <1539919345-10703-1-git-send-email-jasowang@redhat.com> References: <1539919345-10703-1-git-send-email-jasowang@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.38]); Fri, 19 Oct 2018 03:22:36 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PULL V2 02/26] colo-compare: implement the process of checkpoint X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Zhang Chen , Jason Wang , zhanghailiang , Zhang Chen Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RDMRC_1 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" From: Zhang Chen While do checkpoint, we need to flush all the unhandled packets, By using the filter notifier mechanism, we can easily to notify every compare object to do this process, which runs inside of compare threads as a coroutine. Signed-off-by: zhanghailiang Signed-off-by: Zhang Chen Signed-off-by: Zhang Chen Signed-off-by: Jason Wang --- include/migration/colo.h | 6 ++++ net/colo-compare.c | 78 ++++++++++++++++++++++++++++++++++++++++++++= ++++ net/colo-compare.h | 22 ++++++++++++++ 3 files changed, 106 insertions(+) create mode 100644 net/colo-compare.h diff --git a/include/migration/colo.h b/include/migration/colo.h index 2fe48ad..fefb2fc 100644 --- a/include/migration/colo.h +++ b/include/migration/colo.h @@ -16,6 +16,12 @@ #include "qemu-common.h" #include "qapi/qapi-types-migration.h" =20 +enum colo_event { + COLO_EVENT_NONE, + COLO_EVENT_CHECKPOINT, + COLO_EVENT_FAILOVER, +}; + void colo_info_init(void); =20 void migrate_start_colo_process(MigrationState *s); diff --git a/net/colo-compare.c b/net/colo-compare.c index dd745a4..80e6532 100644 --- a/net/colo-compare.c +++ b/net/colo-compare.c @@ -27,11 +27,16 @@ #include "qemu/sockets.h" #include "colo.h" #include "sysemu/iothread.h" +#include "net/colo-compare.h" +#include "migration/colo.h" =20 #define TYPE_COLO_COMPARE "colo-compare" #define COLO_COMPARE(obj) \ OBJECT_CHECK(CompareState, (obj), TYPE_COLO_COMPARE) =20 +static QTAILQ_HEAD(, CompareState) net_compares =3D + QTAILQ_HEAD_INITIALIZER(net_compares); + #define COMPARE_READ_LEN_MAX NET_BUFSIZE #define MAX_QUEUE_SIZE 1024 =20 @@ -41,6 +46,10 @@ /* TODO: Should be configurable */ #define REGULAR_PACKET_CHECK_MS 3000 =20 +static QemuMutex event_mtx; +static QemuCond event_complete_cond; +static int event_unhandled_count; + /* * + CompareState ++ * | | @@ -87,6 +96,11 @@ typedef struct CompareState { IOThread *iothread; GMainContext *worker_context; QEMUTimer *packet_check_timer; + + QEMUBH *event_bh; + enum colo_event event; + + QTAILQ_ENTRY(CompareState) next; } CompareState; =20 typedef struct CompareClass { @@ -736,6 +750,25 @@ static void check_old_packet_regular(void *opaque) REGULAR_PACKET_CHECK_MS); } =20 +/* Public API, Used for COLO frame to notify compare event */ +void colo_notify_compares_event(void *opaque, int event, Error **errp) +{ + CompareState *s; + + qemu_mutex_lock(&event_mtx); + QTAILQ_FOREACH(s, &net_compares, next) { + s->event =3D event; + qemu_bh_schedule(s->event_bh); + event_unhandled_count++; + } + /* Wait all compare threads to finish handling this event */ + while (event_unhandled_count > 0) { + qemu_cond_wait(&event_complete_cond, &event_mtx); + } + + qemu_mutex_unlock(&event_mtx); +} + static void colo_compare_timer_init(CompareState *s) { AioContext *ctx =3D iothread_get_aio_context(s->iothread); @@ -756,6 +789,30 @@ static void colo_compare_timer_del(CompareState *s) } } =20 +static void colo_flush_packets(void *opaque, void *user_data); + +static void colo_compare_handle_event(void *opaque) +{ + CompareState *s =3D opaque; + + switch (s->event) { + case COLO_EVENT_CHECKPOINT: + g_queue_foreach(&s->conn_list, colo_flush_packets, s); + break; + case COLO_EVENT_FAILOVER: + break; + default: + break; + } + + assert(event_unhandled_count > 0); + + qemu_mutex_lock(&event_mtx); + event_unhandled_count--; + qemu_cond_broadcast(&event_complete_cond); + qemu_mutex_unlock(&event_mtx); +} + static void colo_compare_iothread(CompareState *s) { object_ref(OBJECT(s->iothread)); @@ -769,6 +826,7 @@ static void colo_compare_iothread(CompareState *s) s, s->worker_context, true); =20 colo_compare_timer_init(s); + s->event_bh =3D qemu_bh_new(colo_compare_handle_event, s); } =20 static char *compare_get_pri_indev(Object *obj, Error **errp) @@ -926,8 +984,13 @@ static void colo_compare_complete(UserCreatable *uc, E= rror **errp) net_socket_rs_init(&s->pri_rs, compare_pri_rs_finalize, s->vnet_hdr); net_socket_rs_init(&s->sec_rs, compare_sec_rs_finalize, s->vnet_hdr); =20 + QTAILQ_INSERT_TAIL(&net_compares, s, next); + g_queue_init(&s->conn_list); =20 + qemu_mutex_init(&event_mtx); + qemu_cond_init(&event_complete_cond); + s->connection_track_table =3D g_hash_table_new_full(connection_key_has= h, connection_key_equal, g_free, @@ -990,6 +1053,7 @@ static void colo_compare_init(Object *obj) static void colo_compare_finalize(Object *obj) { CompareState *s =3D COLO_COMPARE(obj); + CompareState *tmp =3D NULL; =20 qemu_chr_fe_deinit(&s->chr_pri_in, false); qemu_chr_fe_deinit(&s->chr_sec_in, false); @@ -997,6 +1061,16 @@ static void colo_compare_finalize(Object *obj) if (s->iothread) { colo_compare_timer_del(s); } + + qemu_bh_delete(s->event_bh); + + QTAILQ_FOREACH(tmp, &net_compares, next) { + if (tmp =3D=3D s) { + QTAILQ_REMOVE(&net_compares, s, next); + break; + } + } + /* Release all unhandled packets after compare thead exited */ g_queue_foreach(&s->conn_list, colo_flush_packets, s); =20 @@ -1009,6 +1083,10 @@ static void colo_compare_finalize(Object *obj) if (s->iothread) { object_unref(OBJECT(s->iothread)); } + + qemu_mutex_destroy(&event_mtx); + qemu_cond_destroy(&event_complete_cond); + g_free(s->pri_indev); g_free(s->sec_indev); g_free(s->outdev); diff --git a/net/colo-compare.h b/net/colo-compare.h new file mode 100644 index 0000000..1b1ce76 --- /dev/null +++ b/net/colo-compare.h @@ -0,0 +1,22 @@ +/* + * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO) + * (a.k.a. Fault Tolerance or Continuous Replication) + * + * Copyright (c) 2017 HUAWEI TECHNOLOGIES CO., LTD. + * Copyright (c) 2017 FUJITSU LIMITED + * Copyright (c) 2017 Intel Corporation + * + * Authors: + * zhanghailiang + * Zhang Chen + * + * This work is licensed under the terms of the GNU GPL, version 2 or + * later. See the COPYING file in the top-level directory. + */ + +#ifndef QEMU_COLO_COMPARE_H +#define QEMU_COLO_COMPARE_H + +void colo_notify_compares_event(void *opaque, int event, Error **errp); + +#endif /* QEMU_COLO_COMPARE_H */ --=20 2.5.0 From nobody Thu May 16 07:55:05 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1539919679228868.7890489785807; Thu, 18 Oct 2018 20:27:59 -0700 (PDT) Received: from localhost ([::1]:46473 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLS7-0006Dz-Sb for importer@patchew.org; Thu, 18 Oct 2018 23:27:55 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:36261) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLN3-0002MG-SR for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:22:43 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gDLN2-0002ZU-BZ for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:22:41 -0400 Received: from mx1.redhat.com ([209.132.183.28]:60130) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gDLN1-0002Xy-QP for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:22:40 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id F2DE030012EE; Fri, 19 Oct 2018 03:22:38 +0000 (UTC) Received: from jason-ThinkPad-T430s.redhat.com (ovpn-12-114.pek2.redhat.com [10.72.12.114]) by smtp.corp.redhat.com (Postfix) with ESMTP id B04675C69A; Fri, 19 Oct 2018 03:22:36 +0000 (UTC) From: Jason Wang To: peter.maydell@linaro.org, qemu-devel@nongnu.org Date: Fri, 19 Oct 2018 11:22:02 +0800 Message-Id: <1539919345-10703-4-git-send-email-jasowang@redhat.com> In-Reply-To: <1539919345-10703-1-git-send-email-jasowang@redhat.com> References: <1539919345-10703-1-git-send-email-jasowang@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.44]); Fri, 19 Oct 2018 03:22:39 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PULL V2 03/26] colo-compare: use notifier to notify packets comparing result X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Zhang Chen , Jason Wang , zhanghailiang , Zhang Chen Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RDMRC_1 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" From: Zhang Chen It's a good idea to use notifier to notify COLO frame of inconsistent packets comparing. Signed-off-by: Zhang Chen Signed-off-by: Zhang Chen Signed-off-by: zhanghailiang Signed-off-by: Jason Wang --- net/colo-compare.c | 37 ++++++++++++++++++++++++++----------- net/colo-compare.h | 2 ++ 2 files changed, 28 insertions(+), 11 deletions(-) diff --git a/net/colo-compare.c b/net/colo-compare.c index 80e6532..3f7e240 100644 --- a/net/colo-compare.c +++ b/net/colo-compare.c @@ -29,6 +29,7 @@ #include "sysemu/iothread.h" #include "net/colo-compare.h" #include "migration/colo.h" +#include "migration/migration.h" =20 #define TYPE_COLO_COMPARE "colo-compare" #define COLO_COMPARE(obj) \ @@ -37,6 +38,9 @@ static QTAILQ_HEAD(, CompareState) net_compares =3D QTAILQ_HEAD_INITIALIZER(net_compares); =20 +static NotifierList colo_compare_notifiers =3D + NOTIFIER_LIST_INITIALIZER(colo_compare_notifiers); + #define COMPARE_READ_LEN_MAX NET_BUFSIZE #define MAX_QUEUE_SIZE 1024 =20 @@ -326,6 +330,12 @@ static bool colo_mark_tcp_pkt(Packet *ppkt, Packet *sp= kt, return false; } =20 +static void colo_compare_inconsistency_notify(void) +{ + notifier_list_notify(&colo_compare_notifiers, + migrate_get_current()); +} + static void colo_compare_tcp(CompareState *s, Connection *conn) { Packet *ppkt =3D NULL, *spkt =3D NULL; @@ -427,10 +437,7 @@ sec: qemu_hexdump((char *)spkt->data, stderr, "colo-compare spkt", spkt->size); =20 - /* - * colo_compare_inconsistent_notify(); - * TODO: notice to checkpoint(); - */ + colo_compare_inconsistency_notify(); } } =20 @@ -561,8 +568,18 @@ static int colo_old_packet_check_one(Packet *pkt, int6= 4_t *check_time) } } =20 +void colo_compare_register_notifier(Notifier *notify) +{ + notifier_list_add(&colo_compare_notifiers, notify); +} + +void colo_compare_unregister_notifier(Notifier *notify) +{ + notifier_remove(notify); +} + static int colo_old_packet_check_one_conn(Connection *conn, - void *user_data) + void *user_data) { GList *result =3D NULL; int64_t check_time =3D REGULAR_PACKET_CHECK_MS; @@ -573,10 +590,7 @@ static int colo_old_packet_check_one_conn(Connection *= conn, =20 if (result) { /* Do checkpoint will flush old packet */ - /* - * TODO: Notify colo frame to do checkpoint. - * colo_compare_inconsistent_notify(); - */ + colo_compare_inconsistency_notify(); return 0; } =20 @@ -620,11 +634,12 @@ static void colo_compare_packet(CompareState *s, Conn= ection *conn, /* * If one packet arrive late, the secondary_list or * primary_list will be empty, so we can't compare it - * until next comparison. + * until next comparison. If the packets in the list are + * timeout, it will trigger a checkpoint request. */ trace_colo_compare_main("packet different"); g_queue_push_head(&conn->primary_list, pkt); - /* TODO: colo_notify_checkpoint();*/ + colo_compare_inconsistency_notify(); break; } } diff --git a/net/colo-compare.h b/net/colo-compare.h index 1b1ce76..22ddd51 100644 --- a/net/colo-compare.h +++ b/net/colo-compare.h @@ -18,5 +18,7 @@ #define QEMU_COLO_COMPARE_H =20 void colo_notify_compares_event(void *opaque, int event, Error **errp); +void colo_compare_register_notifier(Notifier *notify); +void colo_compare_unregister_notifier(Notifier *notify); =20 #endif /* QEMU_COLO_COMPARE_H */ --=20 2.5.0 From nobody Thu May 16 07:55:05 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (208.118.235.17 [208.118.235.17]) by mx.zohomail.com with SMTPS id 1539919507930834.513126730603; Thu, 18 Oct 2018 20:25:07 -0700 (PDT) Received: from localhost ([::1]:46458 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLPI-0003sq-E0 for importer@patchew.org; Thu, 18 Oct 2018 23:25:00 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:36338) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLN6-0002MZ-D8 for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:22:45 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gDLN5-0002e0-3o for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:22:44 -0400 Received: from mx1.redhat.com ([209.132.183.28]:41996) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gDLN4-0002cj-MP for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:22:42 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id C1C46308212A; Fri, 19 Oct 2018 03:22:41 +0000 (UTC) Received: from jason-ThinkPad-T430s.redhat.com (ovpn-12-114.pek2.redhat.com [10.72.12.114]) by smtp.corp.redhat.com (Postfix) with ESMTP id 7E4B85C69A; Fri, 19 Oct 2018 03:22:39 +0000 (UTC) From: Jason Wang To: peter.maydell@linaro.org, qemu-devel@nongnu.org Date: Fri, 19 Oct 2018 11:22:03 +0800 Message-Id: <1539919345-10703-5-git-send-email-jasowang@redhat.com> In-Reply-To: <1539919345-10703-1-git-send-email-jasowang@redhat.com> References: <1539919345-10703-1-git-send-email-jasowang@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.42]); Fri, 19 Oct 2018 03:22:41 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PULL V2 04/26] COLO: integrate colo compare with colo frame X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Zhang Chen , Jason Wang , zhanghailiang , Zhang Chen Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RDMRC_1 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" From: Zhang Chen For COLO FT, both the PVM and SVM run at the same time, only sync the state while it needs. So here, let SVM runs while not doing checkpoint, change DEFAULT_MIGRATE_X_CHECKPOINT_DELAY to 200*100. Besides, we forgot to release colo_checkpoint_semd and colo_delay_timer, fix them here. Signed-off-by: zhanghailiang Signed-off-by: Zhang Chen Signed-off-by: Zhang Chen Reviewed-by: Dr. David Alan Gilbert Signed-off-by: Jason Wang --- migration/colo.c | 42 ++++++++++++++++++++++++++++++++++++++++-- migration/migration.c | 6 ++---- 2 files changed, 42 insertions(+), 6 deletions(-) diff --git a/migration/colo.c b/migration/colo.c index 88936f5..f4bdfde 100644 --- a/migration/colo.c +++ b/migration/colo.c @@ -25,8 +25,11 @@ #include "qemu/error-report.h" #include "migration/failover.h" #include "replication.h" +#include "net/colo-compare.h" +#include "net/colo.h" =20 static bool vmstate_loading; +static Notifier packets_compare_notifier; =20 #define COLO_BUFFER_BASE_SIZE (4 * 1024 * 1024) =20 @@ -343,6 +346,11 @@ static int colo_do_checkpoint_transaction(MigrationSta= te *s, goto out; } =20 + colo_notify_compares_event(NULL, COLO_EVENT_CHECKPOINT, &local_err); + if (local_err) { + goto out; + } + /* Disable block migration */ migrate_set_block_enabled(false, &local_err); qemu_savevm_state_header(fb); @@ -400,6 +408,11 @@ out: return ret; } =20 +static void colo_compare_notify_checkpoint(Notifier *notifier, void *data) +{ + colo_checkpoint_notify(data); +} + static void colo_process_checkpoint(MigrationState *s) { QIOChannelBuffer *bioc; @@ -416,6 +429,9 @@ static void colo_process_checkpoint(MigrationState *s) goto out; } =20 + packets_compare_notifier.notify =3D colo_compare_notify_checkpoint; + colo_compare_register_notifier(&packets_compare_notifier); + /* * Wait for Secondary finish loading VM states and enter COLO * restore. @@ -461,11 +477,21 @@ out: qemu_fclose(fb); } =20 - timer_del(s->colo_delay_timer); - /* Hope this not to be too long to wait here */ qemu_sem_wait(&s->colo_exit_sem); qemu_sem_destroy(&s->colo_exit_sem); + + /* + * It is safe to unregister notifier after failover finished. + * Besides, colo_delay_timer and colo_checkpoint_sem can't be + * released befor unregister notifier, or there will be use-after-free + * error. + */ + colo_compare_unregister_notifier(&packets_compare_notifier); + timer_del(s->colo_delay_timer); + timer_free(s->colo_delay_timer); + qemu_sem_destroy(&s->colo_checkpoint_sem); + /* * Must be called after failover BH is completed, * Or the failover BH may shutdown the wrong fd that @@ -559,6 +585,11 @@ void *colo_process_incoming_thread(void *opaque) fb =3D qemu_fopen_channel_input(QIO_CHANNEL(bioc)); object_unref(OBJECT(bioc)); =20 + qemu_mutex_lock_iothread(); + vm_start(); + trace_colo_vm_state_change("stop", "run"); + qemu_mutex_unlock_iothread(); + colo_send_message(mis->to_src_file, COLO_MESSAGE_CHECKPOINT_READY, &local_err); if (local_err) { @@ -578,6 +609,11 @@ void *colo_process_incoming_thread(void *opaque) goto out; } =20 + qemu_mutex_lock_iothread(); + vm_stop_force_state(RUN_STATE_COLO); + trace_colo_vm_state_change("run", "stop"); + qemu_mutex_unlock_iothread(); + /* FIXME: This is unnecessary for periodic checkpoint mode */ colo_send_message(mis->to_src_file, COLO_MESSAGE_CHECKPOINT_REPLY, &local_err); @@ -631,6 +667,8 @@ void *colo_process_incoming_thread(void *opaque) } =20 vmstate_loading =3D false; + vm_start(); + trace_colo_vm_state_change("stop", "run"); qemu_mutex_unlock_iothread(); =20 if (failover_get_state() =3D=3D FAILOVER_STATUS_RELAUNCH) { diff --git a/migration/migration.c b/migration/migration.c index d6ae879..32ce058 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -76,10 +76,8 @@ /* Migration XBZRLE default cache size */ #define DEFAULT_MIGRATE_XBZRLE_CACHE_SIZE (64 * 1024 * 1024) =20 -/* The delay time (in ms) between two COLO checkpoints - * Note: Please change this default value to 10000 when we support hybrid = mode. - */ -#define DEFAULT_MIGRATE_X_CHECKPOINT_DELAY 200 +/* The delay time (in ms) between two COLO checkpoints */ +#define DEFAULT_MIGRATE_X_CHECKPOINT_DELAY (200 * 100) #define DEFAULT_MIGRATE_MULTIFD_CHANNELS 2 #define DEFAULT_MIGRATE_MULTIFD_PAGE_COUNT 16 =20 --=20 2.5.0 From nobody Thu May 16 07:55:05 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1539919679556227.96030880851356; Thu, 18 Oct 2018 20:27:59 -0700 (PDT) Received: from localhost ([::1]:46474 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLS8-0006Ep-5H for importer@patchew.org; Thu, 18 Oct 2018 23:27:56 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:36423) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLNA-0002N2-7r for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:22:52 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gDLN8-0002hb-Id for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:22:47 -0400 Received: from mx1.redhat.com ([209.132.183.28]:55792) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gDLN7-0002g7-OC for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:22:46 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id D8CB983F45; Fri, 19 Oct 2018 03:22:44 +0000 (UTC) Received: from jason-ThinkPad-T430s.redhat.com (ovpn-12-114.pek2.redhat.com [10.72.12.114]) by smtp.corp.redhat.com (Postfix) with ESMTP id 4EE51171A8; Fri, 19 Oct 2018 03:22:42 +0000 (UTC) From: Jason Wang To: peter.maydell@linaro.org, qemu-devel@nongnu.org Date: Fri, 19 Oct 2018 11:22:04 +0800 Message-Id: <1539919345-10703-6-git-send-email-jasowang@redhat.com> In-Reply-To: <1539919345-10703-1-git-send-email-jasowang@redhat.com> References: <1539919345-10703-1-git-send-email-jasowang@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.27]); Fri, 19 Oct 2018 03:22:44 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PULL V2 05/26] COLO: Add block replication into colo process X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Zhang Chen , Jason Wang , zhanghailiang , Li Zhijian , Zhang Chen Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RDMRC_1 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" From: Zhang Chen Make sure master start block replication after slave's block replication started. Besides, we need to activate VM's blocks before goes into COLO state. Signed-off-by: zhanghailiang Signed-off-by: Li Zhijian Signed-off-by: Zhang Chen Signed-off-by: Zhang Chen Signed-off-by: Jason Wang --- migration/colo.c | 43 +++++++++++++++++++++++++++++++++++++++++++ migration/migration.c | 10 ++++++++++ 2 files changed, 53 insertions(+) diff --git a/migration/colo.c b/migration/colo.c index f4bdfde..af04010 100644 --- a/migration/colo.c +++ b/migration/colo.c @@ -27,6 +27,7 @@ #include "replication.h" #include "net/colo-compare.h" #include "net/colo.h" +#include "block/block.h" =20 static bool vmstate_loading; static Notifier packets_compare_notifier; @@ -56,6 +57,7 @@ static void secondary_vm_do_failover(void) { int old_state; MigrationIncomingState *mis =3D migration_incoming_get_current(); + Error *local_err =3D NULL; =20 /* Can not do failover during the process of VM's loading VMstate, Or * it will break the secondary VM. @@ -73,6 +75,11 @@ static void secondary_vm_do_failover(void) migrate_set_state(&mis->state, MIGRATION_STATUS_COLO, MIGRATION_STATUS_COMPLETED); =20 + replication_stop_all(true, &local_err); + if (local_err) { + error_report_err(local_err); + } + if (!autostart) { error_report("\"-S\" qemu option will be ignored in secondary side= "); /* recover runstate to normal migration finish state */ @@ -110,6 +117,7 @@ static void primary_vm_do_failover(void) { MigrationState *s =3D migrate_get_current(); int old_state; + Error *local_err =3D NULL; =20 migrate_set_state(&s->state, MIGRATION_STATUS_COLO, MIGRATION_STATUS_COMPLETED); @@ -133,6 +141,13 @@ static void primary_vm_do_failover(void) FailoverStatus_str(old_state)); return; } + + replication_stop_all(true, &local_err); + if (local_err) { + error_report_err(local_err); + local_err =3D NULL; + } + /* Notify COLO thread that failover work is finished */ qemu_sem_post(&s->colo_exit_sem); } @@ -356,6 +371,11 @@ static int colo_do_checkpoint_transaction(MigrationSta= te *s, qemu_savevm_state_header(fb); qemu_savevm_state_setup(fb); qemu_mutex_lock_iothread(); + replication_do_checkpoint_all(&local_err); + if (local_err) { + qemu_mutex_unlock_iothread(); + goto out; + } qemu_savevm_state_complete_precopy(fb, false, false); qemu_mutex_unlock_iothread(); =20 @@ -446,6 +466,12 @@ static void colo_process_checkpoint(MigrationState *s) object_unref(OBJECT(bioc)); =20 qemu_mutex_lock_iothread(); + replication_start_all(REPLICATION_MODE_PRIMARY, &local_err); + if (local_err) { + qemu_mutex_unlock_iothread(); + goto out; + } + vm_start(); qemu_mutex_unlock_iothread(); trace_colo_vm_state_change("stop", "run"); @@ -586,6 +612,11 @@ void *colo_process_incoming_thread(void *opaque) object_unref(OBJECT(bioc)); =20 qemu_mutex_lock_iothread(); + replication_start_all(REPLICATION_MODE_SECONDARY, &local_err); + if (local_err) { + qemu_mutex_unlock_iothread(); + goto out; + } vm_start(); trace_colo_vm_state_change("stop", "run"); qemu_mutex_unlock_iothread(); @@ -666,6 +697,18 @@ void *colo_process_incoming_thread(void *opaque) goto out; } =20 + replication_get_error_all(&local_err); + if (local_err) { + qemu_mutex_unlock_iothread(); + goto out; + } + /* discard colo disk buffer */ + replication_do_checkpoint_all(&local_err); + if (local_err) { + qemu_mutex_unlock_iothread(); + goto out; + } + vmstate_loading =3D false; vm_start(); trace_colo_vm_state_change("stop", "run"); diff --git a/migration/migration.c b/migration/migration.c index 32ce058..bf5fcd1 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -386,6 +386,7 @@ static void process_incoming_migration_co(void *opaque) MigrationIncomingState *mis =3D migration_incoming_get_current(); PostcopyState ps; int ret; + Error *local_err =3D NULL; =20 assert(mis->from_src_file); mis->migration_incoming_co =3D qemu_coroutine_self(); @@ -418,6 +419,15 @@ static void process_incoming_migration_co(void *opaque) =20 /* we get COLO info, and know if we are in COLO mode */ if (!ret && migration_incoming_enable_colo()) { + /* Make sure all file formats flush their mutable metadata */ + bdrv_invalidate_cache_all(&local_err); + if (local_err) { + migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE, + MIGRATION_STATUS_FAILED); + error_report_err(local_err); + exit(EXIT_FAILURE); + } + qemu_thread_create(&mis->colo_incoming_thread, "COLO incoming", colo_process_incoming_thread, mis, QEMU_THREAD_JOINABLE); mis->have_colo_incoming_thread =3D true; --=20 2.5.0 From nobody Thu May 16 07:55:05 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (208.118.235.17 [208.118.235.17]) by mx.zohomail.com with SMTPS id 1539919862678435.80465550893143; Thu, 18 Oct 2018 20:31:02 -0700 (PDT) Received: from localhost ([::1]:46487 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLUv-0000jP-Pu for importer@patchew.org; Thu, 18 Oct 2018 23:30:49 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:36507) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLNG-0002Sk-0w for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:22:55 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gDLNA-0002kH-VT for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:22:53 -0400 Received: from mx1.redhat.com ([209.132.183.28]:58092) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gDLNA-0002jA-Dx for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:22:48 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id A2C6D30F5586; Fri, 19 Oct 2018 03:22:47 +0000 (UTC) Received: from jason-ThinkPad-T430s.redhat.com (ovpn-12-114.pek2.redhat.com [10.72.12.114]) by smtp.corp.redhat.com (Postfix) with ESMTP id 62AA35C69A; Fri, 19 Oct 2018 03:22:45 +0000 (UTC) From: Jason Wang To: peter.maydell@linaro.org, qemu-devel@nongnu.org Date: Fri, 19 Oct 2018 11:22:05 +0800 Message-Id: <1539919345-10703-7-git-send-email-jasowang@redhat.com> In-Reply-To: <1539919345-10703-1-git-send-email-jasowang@redhat.com> References: <1539919345-10703-1-git-send-email-jasowang@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.45]); Fri, 19 Oct 2018 03:22:47 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PULL V2 06/26] COLO: Remove colo_state migration struct X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Zhang Chen , Jason Wang , zhanghailiang , Zhang Chen Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RDMRC_1 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" From: Zhang Chen We need to know if migration is going into COLO state for incoming side before start normal migration. Instead by using the VMStateDescription to send colo_state from source side to destination side, we use MIG_CMD_ENABLE_COLO to indicate whether COLO is enabled or not. Signed-off-by: zhanghailiang Signed-off-by: Zhang Chen Signed-off-by: Zhang Chen Reviewed-by: Dr. David Alan Gilbert Signed-off-by: Jason Wang --- include/migration/colo.h | 5 ++-- migration/Makefile.objs | 2 +- migration/colo-comm.c | 76 --------------------------------------------= ---- migration/colo.c | 13 ++++++++- migration/migration.c | 23 ++++++++++++++- migration/savevm.c | 17 +++++++++++ migration/savevm.h | 1 + migration/trace-events | 1 + vl.c | 2 -- 9 files changed, 57 insertions(+), 83 deletions(-) delete mode 100644 migration/colo-comm.c diff --git a/include/migration/colo.h b/include/migration/colo.h index fefb2fc..99ce17a 100644 --- a/include/migration/colo.h +++ b/include/migration/colo.h @@ -28,8 +28,9 @@ void migrate_start_colo_process(MigrationState *s); bool migration_in_colo_state(void); =20 /* loadvm */ -bool migration_incoming_enable_colo(void); -void migration_incoming_exit_colo(void); +void migration_incoming_enable_colo(void); +void migration_incoming_disable_colo(void); +bool migration_incoming_colo_enabled(void); void *colo_process_incoming_thread(void *opaque); bool migration_incoming_in_colo_state(void); =20 diff --git a/migration/Makefile.objs b/migration/Makefile.objs index c83ec47..a4f3baf 100644 --- a/migration/Makefile.objs +++ b/migration/Makefile.objs @@ -1,6 +1,6 @@ common-obj-y +=3D migration.o socket.o fd.o exec.o common-obj-y +=3D tls.o channel.o savevm.o -common-obj-y +=3D colo-comm.o colo.o colo-failover.o +common-obj-y +=3D colo.o colo-failover.o common-obj-y +=3D vmstate.o vmstate-types.o page_cache.o common-obj-y +=3D qemu-file.o global_state.o common-obj-y +=3D qemu-file-channel.o diff --git a/migration/colo-comm.c b/migration/colo-comm.c deleted file mode 100644 index df26e4d..0000000 --- a/migration/colo-comm.c +++ /dev/null @@ -1,76 +0,0 @@ -/* - * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO) - * (a.k.a. Fault Tolerance or Continuous Replication) - * - * Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD. - * Copyright (c) 2016 FUJITSU LIMITED - * Copyright (c) 2016 Intel Corporation - * - * This work is licensed under the terms of the GNU GPL, version 2 or - * later. See the COPYING file in the top-level directory. - * - */ - -#include "qemu/osdep.h" -#include "migration.h" -#include "migration/colo.h" -#include "migration/vmstate.h" -#include "trace.h" - -typedef struct { - bool colo_requested; -} COLOInfo; - -static COLOInfo colo_info; - -COLOMode get_colo_mode(void) -{ - if (migration_in_colo_state()) { - return COLO_MODE_PRIMARY; - } else if (migration_incoming_in_colo_state()) { - return COLO_MODE_SECONDARY; - } else { - return COLO_MODE_UNKNOWN; - } -} - -static int colo_info_pre_save(void *opaque) -{ - COLOInfo *s =3D opaque; - - s->colo_requested =3D migrate_colo_enabled(); - - return 0; -} - -static bool colo_info_need(void *opaque) -{ - return migrate_colo_enabled(); -} - -static const VMStateDescription colo_state =3D { - .name =3D "COLOState", - .version_id =3D 1, - .minimum_version_id =3D 1, - .pre_save =3D colo_info_pre_save, - .needed =3D colo_info_need, - .fields =3D (VMStateField[]) { - VMSTATE_BOOL(colo_requested, COLOInfo), - VMSTATE_END_OF_LIST() - }, -}; - -void colo_info_init(void) -{ - vmstate_register(NULL, 0, &colo_state, &colo_info); -} - -bool migration_incoming_enable_colo(void) -{ - return colo_info.colo_requested; -} - -void migration_incoming_exit_colo(void) -{ - colo_info.colo_requested =3D false; -} diff --git a/migration/colo.c b/migration/colo.c index af04010..d3163b5 100644 --- a/migration/colo.c +++ b/migration/colo.c @@ -152,6 +152,17 @@ static void primary_vm_do_failover(void) qemu_sem_post(&s->colo_exit_sem); } =20 +COLOMode get_colo_mode(void) +{ + if (migration_in_colo_state()) { + return COLO_MODE_PRIMARY; + } else if (migration_incoming_in_colo_state()) { + return COLO_MODE_SECONDARY; + } else { + return COLO_MODE_UNKNOWN; + } +} + void colo_do_failover(MigrationState *s) { /* Make sure VM stopped while failover happened. */ @@ -746,7 +757,7 @@ out: if (mis->to_src_file) { qemu_fclose(mis->to_src_file); } - migration_incoming_exit_colo(); + migration_incoming_disable_colo(); =20 rcu_unregister_thread(); return NULL; diff --git a/migration/migration.c b/migration/migration.c index bf5fcd1..215e81a 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -296,6 +296,22 @@ int migrate_send_rp_req_pages(MigrationIncomingState *= mis, const char *rbname, return migrate_send_rp_message(mis, msg_type, msglen, bufc); } =20 +static bool migration_colo_enabled; +bool migration_incoming_colo_enabled(void) +{ + return migration_colo_enabled; +} + +void migration_incoming_disable_colo(void) +{ + migration_colo_enabled =3D false; +} + +void migration_incoming_enable_colo(void) +{ + migration_colo_enabled =3D true; +} + void qemu_start_incoming_migration(const char *uri, Error **errp) { const char *p; @@ -418,7 +434,7 @@ static void process_incoming_migration_co(void *opaque) } =20 /* we get COLO info, and know if we are in COLO mode */ - if (!ret && migration_incoming_enable_colo()) { + if (!ret && migration_incoming_colo_enabled()) { /* Make sure all file formats flush their mutable metadata */ bdrv_invalidate_cache_all(&local_err); if (local_err) { @@ -3025,6 +3041,11 @@ static void *migration_thread(void *opaque) qemu_savevm_send_postcopy_advise(s->to_dst_file); } =20 + if (migrate_colo_enabled()) { + /* Notify migration destination that we enable COLO */ + qemu_savevm_send_colo_enable(s->to_dst_file); + } + qemu_savevm_state_setup(s->to_dst_file); =20 s->setup_time =3D qemu_clock_get_ms(QEMU_CLOCK_HOST) - setup_start; diff --git a/migration/savevm.c b/migration/savevm.c index 2d10e45..09ad962 100644 --- a/migration/savevm.c +++ b/migration/savevm.c @@ -56,6 +56,7 @@ #include "io/channel-file.h" #include "sysemu/replay.h" #include "qjson.h" +#include "migration/colo.h" =20 #ifndef ETH_P_RARP #define ETH_P_RARP 0x8035 @@ -82,6 +83,7 @@ enum qemu_vm_cmd { were previously sent during precopy but are dirty. */ MIG_CMD_PACKAGED, /* Send a wrapped stream within this stream= */ + MIG_CMD_ENABLE_COLO, /* Enable COLO */ MIG_CMD_POSTCOPY_RESUME, /* resume postcopy on dest */ MIG_CMD_RECV_BITMAP, /* Request for recved bitmap on dst */ MIG_CMD_MAX @@ -841,6 +843,12 @@ static void qemu_savevm_command_send(QEMUFile *f, qemu_fflush(f); } =20 +void qemu_savevm_send_colo_enable(QEMUFile *f) +{ + trace_savevm_send_colo_enable(); + qemu_savevm_command_send(f, MIG_CMD_ENABLE_COLO, 0, NULL); +} + void qemu_savevm_send_ping(QEMUFile *f, uint32_t value) { uint32_t buf; @@ -1922,6 +1930,12 @@ static int loadvm_handle_recv_bitmap(MigrationIncomi= ngState *mis, return 0; } =20 +static int loadvm_process_enable_colo(MigrationIncomingState *mis) +{ + migration_incoming_enable_colo(); + return 0; +} + /* * Process an incoming 'QEMU_VM_COMMAND' * 0 just a normal return @@ -2001,6 +2015,9 @@ static int loadvm_process_command(QEMUFile *f) =20 case MIG_CMD_RECV_BITMAP: return loadvm_handle_recv_bitmap(mis, len); + + case MIG_CMD_ENABLE_COLO: + return loadvm_process_enable_colo(mis); } =20 return 0; diff --git a/migration/savevm.h b/migration/savevm.h index a5e65b8..8373c2f 100644 --- a/migration/savevm.h +++ b/migration/savevm.h @@ -55,6 +55,7 @@ void qemu_savevm_send_postcopy_ram_discard(QEMUFile *f, c= onst char *name, uint16_t len, uint64_t *start_list, uint64_t *length_list); +void qemu_savevm_send_colo_enable(QEMUFile *f); =20 int qemu_loadvm_state(QEMUFile *f); void qemu_loadvm_state_cleanup(void); diff --git a/migration/trace-events b/migration/trace-events index 9430f3c..fa0ff3f 100644 --- a/migration/trace-events +++ b/migration/trace-events @@ -37,6 +37,7 @@ savevm_send_ping(uint32_t val) "0x%x" savevm_send_postcopy_listen(void) "" savevm_send_postcopy_run(void) "" savevm_send_postcopy_resume(void) "" +savevm_send_colo_enable(void) "" savevm_send_recv_bitmap(char *name) "%s" savevm_state_setup(void) "" savevm_state_resume_prepare(void) "" diff --git a/vl.c b/vl.c index 4e25c78..ac3ed17 100644 --- a/vl.c +++ b/vl.c @@ -4365,8 +4365,6 @@ int main(int argc, char **argv, char **envp) #endif } =20 - colo_info_init(); - if (net_init_clients(&err) < 0) { error_report_err(err); exit(1); --=20 2.5.0 From nobody Thu May 16 07:55:05 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1539919679986604.8879234930623; Thu, 18 Oct 2018 20:27:59 -0700 (PDT) Received: from localhost ([::1]:46476 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLSA-0006FB-S5 for importer@patchew.org; Thu, 18 Oct 2018 23:27:58 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:36506) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLNG-0002Sj-0y for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:22:55 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gDLNE-0002mh-1y for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:22:53 -0400 Received: from mx1.redhat.com ([209.132.183.28]:34694) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gDLND-0002lt-Nv for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:22:51 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id B7E3CC05169D; Fri, 19 Oct 2018 03:22:50 +0000 (UTC) Received: from jason-ThinkPad-T430s.redhat.com (ovpn-12-114.pek2.redhat.com [10.72.12.114]) by smtp.corp.redhat.com (Postfix) with ESMTP id 2F9FF5C69A; Fri, 19 Oct 2018 03:22:47 +0000 (UTC) From: Jason Wang To: peter.maydell@linaro.org, qemu-devel@nongnu.org Date: Fri, 19 Oct 2018 11:22:06 +0800 Message-Id: <1539919345-10703-8-git-send-email-jasowang@redhat.com> In-Reply-To: <1539919345-10703-1-git-send-email-jasowang@redhat.com> References: <1539919345-10703-1-git-send-email-jasowang@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.31]); Fri, 19 Oct 2018 03:22:50 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PULL V2 07/26] COLO: Load dirty pages into SVM's RAM cache firstly X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Zhang Chen , Jason Wang , zhanghailiang , Li Zhijian , Zhang Chen Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RDMRC_1 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" From: Zhang Chen We should not load PVM's state directly into SVM, because there maybe some errors happen when SVM is receving data, which will break SVM. We need to ensure receving all data before load the state into SVM. We use an extra memory to cache these data (PVM's ram). The ram cache in secondary= side is initially the same as SVM/PVM's memory. And in the process of checkpoint, we cache the dirty pages of PVM into this ram cache firstly, so this ram ca= che always the same as PVM's memory at every checkpoint, then we flush this cac= hed ram to SVM after we receive all PVM's state. Signed-off-by: zhanghailiang Signed-off-by: Li Zhijian Signed-off-by: Zhang Chen Signed-off-by: Zhang Chen Reviewed-by: Dr. David Alan Gilbert Signed-off-by: Jason Wang --- include/exec/ram_addr.h | 1 + migration/migration.c | 7 +++++ migration/ram.c | 83 +++++++++++++++++++++++++++++++++++++++++++++= ++-- migration/ram.h | 4 +++ migration/savevm.c | 2 +- 5 files changed, 94 insertions(+), 3 deletions(-) diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h index 3abb639..9ecd911 100644 --- a/include/exec/ram_addr.h +++ b/include/exec/ram_addr.h @@ -27,6 +27,7 @@ struct RAMBlock { struct rcu_head rcu; struct MemoryRegion *mr; uint8_t *host; + uint8_t *colo_cache; /* For colo, VM's ram cache */ ram_addr_t offset; ram_addr_t used_length; ram_addr_t max_length; diff --git a/migration/migration.c b/migration/migration.c index 215e81a..7696729 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -444,6 +444,11 @@ static void process_incoming_migration_co(void *opaque) exit(EXIT_FAILURE); } =20 + if (colo_init_ram_cache() < 0) { + error_report("Init ram cache failed"); + exit(EXIT_FAILURE); + } + qemu_thread_create(&mis->colo_incoming_thread, "COLO incoming", colo_process_incoming_thread, mis, QEMU_THREAD_JOINABLE); mis->have_colo_incoming_thread =3D true; @@ -451,6 +456,8 @@ static void process_incoming_migration_co(void *opaque) =20 /* Wait checkpoint incoming thread exit before free resource */ qemu_thread_join(&mis->colo_incoming_thread); + /* We hold the global iothread lock, so it is safe here */ + colo_release_ram_cache(); } =20 if (ret < 0) { diff --git a/migration/ram.c b/migration/ram.c index bc38d98..cd7a446 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -3447,6 +3447,20 @@ static inline void *host_from_ram_block_offset(RAMBl= ock *block, return block->host + offset; } =20 +static inline void *colo_cache_from_block_offset(RAMBlock *block, + ram_addr_t offset) +{ + if (!offset_in_ramblock(block, offset)) { + return NULL; + } + if (!block->colo_cache) { + error_report("%s: colo_cache is NULL in block :%s", + __func__, block->idstr); + return NULL; + } + return block->colo_cache + offset; +} + /** * ram_handle_compressed: handle the zero page case * @@ -3651,6 +3665,58 @@ static void decompress_data_with_multi_threads(QEMUF= ile *f, qemu_mutex_unlock(&decomp_done_lock); } =20 +/* + * colo cache: this is for secondary VM, we cache the whole + * memory of the secondary VM, it is need to hold the global lock + * to call this helper. + */ +int colo_init_ram_cache(void) +{ + RAMBlock *block; + + rcu_read_lock(); + QLIST_FOREACH_RCU(block, &ram_list.blocks, next) { + block->colo_cache =3D qemu_anon_ram_alloc(block->used_length, + NULL, + false); + if (!block->colo_cache) { + error_report("%s: Can't alloc memory for COLO cache of block %= s," + "size 0x" RAM_ADDR_FMT, __func__, block->idstr, + block->used_length); + goto out_locked; + } + memcpy(block->colo_cache, block->host, block->used_length); + } + rcu_read_unlock(); + return 0; + +out_locked: + QLIST_FOREACH_RCU(block, &ram_list.blocks, next) { + if (block->colo_cache) { + qemu_anon_ram_free(block->colo_cache, block->used_length); + block->colo_cache =3D NULL; + } + } + + rcu_read_unlock(); + return -errno; +} + +/* It is need to hold the global lock to call this helper */ +void colo_release_ram_cache(void) +{ + RAMBlock *block; + + rcu_read_lock(); + QLIST_FOREACH_RCU(block, &ram_list.blocks, next) { + if (block->colo_cache) { + qemu_anon_ram_free(block->colo_cache, block->used_length); + block->colo_cache =3D NULL; + } + } + rcu_read_unlock(); +} + /** * ram_load_setup: Setup RAM for migration incoming side * @@ -3667,6 +3733,7 @@ static int ram_load_setup(QEMUFile *f, void *opaque) =20 xbzrle_load_setup(); ramblock_recv_map_init(); + return 0; } =20 @@ -3687,6 +3754,7 @@ static int ram_load_cleanup(void *opaque) g_free(rb->receivedmap); rb->receivedmap =3D NULL; } + return 0; } =20 @@ -3924,13 +3992,24 @@ static int ram_load(QEMUFile *f, void *opaque, int = version_id) RAM_SAVE_FLAG_COMPRESS_PAGE | RAM_SAVE_FLAG_XBZRLE)) { RAMBlock *block =3D ram_block_from_stream(f, flags); =20 - host =3D host_from_ram_block_offset(block, addr); + /* + * After going into COLO, we should load the Page into colo_ca= che. + */ + if (migration_incoming_in_colo_state()) { + host =3D colo_cache_from_block_offset(block, addr); + } else { + host =3D host_from_ram_block_offset(block, addr); + } if (!host) { error_report("Illegal RAM offset " RAM_ADDR_FMT, addr); ret =3D -EINVAL; break; } - ramblock_recv_bitmap_set(block, host); + + if (!migration_incoming_in_colo_state()) { + ramblock_recv_bitmap_set(block, host); + } + trace_ram_load_loop(block->idstr, (uint64_t)addr, flags, host); } =20 diff --git a/migration/ram.h b/migration/ram.h index a139066..83ff1bc 100644 --- a/migration/ram.h +++ b/migration/ram.h @@ -71,4 +71,8 @@ int64_t ramblock_recv_bitmap_send(QEMUFile *file, const char *block_name); int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *rb); =20 +/* ram cache */ +int colo_init_ram_cache(void); +void colo_release_ram_cache(void); + #endif diff --git a/migration/savevm.c b/migration/savevm.c index 09ad962..288b807 100644 --- a/migration/savevm.c +++ b/migration/savevm.c @@ -1933,7 +1933,7 @@ static int loadvm_handle_recv_bitmap(MigrationIncomin= gState *mis, static int loadvm_process_enable_colo(MigrationIncomingState *mis) { migration_incoming_enable_colo(); - return 0; + return colo_init_ram_cache(); } =20 /* --=20 2.5.0 From nobody Thu May 16 07:55:05 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 153991985606823.787590571805822; Thu, 18 Oct 2018 20:30:56 -0700 (PDT) Received: from localhost ([::1]:46491 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLUy-0000m8-F6 for importer@patchew.org; Thu, 18 Oct 2018 23:30:52 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:36548) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLNN-0002XP-Eq for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:03 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gDLNJ-0002p0-Nu for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:22:59 -0400 Received: from mx1.redhat.com ([209.132.183.28]:58986) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gDLNH-0002n7-Mg for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:22:57 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 8717388E58; Fri, 19 Oct 2018 03:22:53 +0000 (UTC) Received: from jason-ThinkPad-T430s.redhat.com (ovpn-12-114.pek2.redhat.com [10.72.12.114]) by smtp.corp.redhat.com (Postfix) with ESMTP id 433C35C69A; Fri, 19 Oct 2018 03:22:50 +0000 (UTC) From: Jason Wang To: peter.maydell@linaro.org, qemu-devel@nongnu.org Date: Fri, 19 Oct 2018 11:22:07 +0800 Message-Id: <1539919345-10703-9-git-send-email-jasowang@redhat.com> In-Reply-To: <1539919345-10703-1-git-send-email-jasowang@redhat.com> References: <1539919345-10703-1-git-send-email-jasowang@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.25]); Fri, 19 Oct 2018 03:22:53 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PULL V2 08/26] ram/COLO: Record the dirty pages that SVM received X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Zhang Chen , Jason Wang , zhanghailiang , Zhang Chen Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RDMRC_1 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" From: Zhang Chen We record the address of the dirty pages that received, it will help flushing pages that cached into SVM. Here, it is a trick, we record dirty pages by re-using migration dirty bitmap. In the later patch, we will start the dirty log for SVM, just like migration, in this way, we can record both the dirty pages caused by PVM and SVM, we only flush those dirty pages from RAM cache while do checkpoint. Signed-off-by: zhanghailiang Signed-off-by: Zhang Chen Signed-off-by: Zhang Chen Reviewed-by: Dr. David Alan Gilbert Signed-off-by: Jason Wang --- migration/ram.c | 43 ++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 40 insertions(+), 3 deletions(-) diff --git a/migration/ram.c b/migration/ram.c index cd7a446..404c8f0 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -3458,6 +3458,15 @@ static inline void *colo_cache_from_block_offset(RAM= Block *block, __func__, block->idstr); return NULL; } + + /* + * During colo checkpoint, we need bitmap of these migrated pages. + * It help us to decide which pages in ram cache should be flushed + * into VM's RAM later. + */ + if (!test_and_set_bit(offset >> TARGET_PAGE_BITS, block->bmap)) { + ram_state->migration_dirty_pages++; + } return block->colo_cache + offset; } =20 @@ -3675,7 +3684,7 @@ int colo_init_ram_cache(void) RAMBlock *block; =20 rcu_read_lock(); - QLIST_FOREACH_RCU(block, &ram_list.blocks, next) { + RAMBLOCK_FOREACH_MIGRATABLE(block) { block->colo_cache =3D qemu_anon_ram_alloc(block->used_length, NULL, false); @@ -3688,10 +3697,29 @@ int colo_init_ram_cache(void) memcpy(block->colo_cache, block->host, block->used_length); } rcu_read_unlock(); + /* + * Record the dirty pages that sent by PVM, we use this dirty bitmap to= gether + * with to decide which page in cache should be flushed into SVM's RAM.= Here + * we use the same name 'ram_bitmap' as for migration. + */ + if (ram_bytes_total()) { + RAMBlock *block; + + RAMBLOCK_FOREACH_MIGRATABLE(block) { + unsigned long pages =3D block->max_length >> TARGET_PAGE_BITS; + + block->bmap =3D bitmap_new(pages); + bitmap_set(block->bmap, 0, pages); + } + } + ram_state =3D g_new0(RAMState, 1); + ram_state->migration_dirty_pages =3D 0; + return 0; =20 out_locked: - QLIST_FOREACH_RCU(block, &ram_list.blocks, next) { + + RAMBLOCK_FOREACH_MIGRATABLE(block) { if (block->colo_cache) { qemu_anon_ram_free(block->colo_cache, block->used_length); block->colo_cache =3D NULL; @@ -3707,14 +3735,23 @@ void colo_release_ram_cache(void) { RAMBlock *block; =20 + RAMBLOCK_FOREACH_MIGRATABLE(block) { + g_free(block->bmap); + block->bmap =3D NULL; + } + rcu_read_lock(); - QLIST_FOREACH_RCU(block, &ram_list.blocks, next) { + + RAMBLOCK_FOREACH_MIGRATABLE(block) { if (block->colo_cache) { qemu_anon_ram_free(block->colo_cache, block->used_length); block->colo_cache =3D NULL; } } + rcu_read_unlock(); + g_free(ram_state); + ram_state =3D NULL; } =20 /** --=20 2.5.0 From nobody Thu May 16 07:55:05 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1539919679301483.46616251152295; Thu, 18 Oct 2018 20:27:59 -0700 (PDT) Received: from localhost ([::1]:46475 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLS9-0006Eu-W9 for importer@patchew.org; Thu, 18 Oct 2018 23:27:58 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:36573) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLNR-0002bI-Il for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:06 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gDLNL-0002pm-KM for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:03 -0400 Received: from mx1.redhat.com ([209.132.183.28]:37858) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gDLNJ-0002oO-L4 for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:22:57 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 587C9711F7; Fri, 19 Oct 2018 03:22:56 +0000 (UTC) Received: from jason-ThinkPad-T430s.redhat.com (ovpn-12-114.pek2.redhat.com [10.72.12.114]) by smtp.corp.redhat.com (Postfix) with ESMTP id 109E65C69A; Fri, 19 Oct 2018 03:22:53 +0000 (UTC) From: Jason Wang To: peter.maydell@linaro.org, qemu-devel@nongnu.org Date: Fri, 19 Oct 2018 11:22:08 +0800 Message-Id: <1539919345-10703-10-git-send-email-jasowang@redhat.com> In-Reply-To: <1539919345-10703-1-git-send-email-jasowang@redhat.com> References: <1539919345-10703-1-git-send-email-jasowang@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.38]); Fri, 19 Oct 2018 03:22:56 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PULL V2 09/26] COLO: Flush memory data from ram cache X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jason Wang , zhanghailiang , Li Zhijian , Zhang Chen Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RDMRC_1 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" From: Zhang Chen During the time of VM's running, PVM may dirty some pages, we will transfer PVM's dirty pages to SVM and store them into SVM's RAM cache at next checkp= oint time. So, the content of SVM's RAM cache will always be same with PVM's mem= ory after checkpoint. Instead of flushing all content of PVM's RAM cache into SVM's MEMORY, we do this in a more efficient way: Only flush any page that dirtied by PVM since last checkpoint. In this way, we can ensure SVM's memory same with PVM's. Besides, we must ensure flush RAM cache before load device state. Signed-off-by: zhanghailiang Signed-off-by: Li Zhijian Reviewed-by: Dr. David Alan Gilbert Signed-off-by: Jason Wang --- migration/ram.c | 37 +++++++++++++++++++++++++++++++++++++ migration/trace-events | 2 ++ 2 files changed, 39 insertions(+) diff --git a/migration/ram.c b/migration/ram.c index 404c8f0..477853d 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -3974,6 +3974,39 @@ static bool postcopy_is_running(void) return ps >=3D POSTCOPY_INCOMING_LISTENING && ps < POSTCOPY_INCOMING_E= ND; } =20 +/* + * Flush content of RAM cache into SVM's memory. + * Only flush the pages that be dirtied by PVM or SVM or both. + */ +static void colo_flush_ram_cache(void) +{ + RAMBlock *block =3D NULL; + void *dst_host; + void *src_host; + unsigned long offset =3D 0; + + trace_colo_flush_ram_cache_begin(ram_state->migration_dirty_pages); + rcu_read_lock(); + block =3D QLIST_FIRST_RCU(&ram_list.blocks); + + while (block) { + offset =3D migration_bitmap_find_dirty(ram_state, block, offset); + + if (offset << TARGET_PAGE_BITS >=3D block->used_length) { + offset =3D 0; + block =3D QLIST_NEXT_RCU(block, next); + } else { + migration_bitmap_clear_dirty(ram_state, block, offset); + dst_host =3D block->host + (offset << TARGET_PAGE_BITS); + src_host =3D block->colo_cache + (offset << TARGET_PAGE_BITS); + memcpy(dst_host, src_host, TARGET_PAGE_SIZE); + } + } + + rcu_read_unlock(); + trace_colo_flush_ram_cache_end(); +} + static int ram_load(QEMUFile *f, void *opaque, int version_id) { int flags =3D 0, ret =3D 0, invalid_flags =3D 0; @@ -4150,6 +4183,10 @@ static int ram_load(QEMUFile *f, void *opaque, int v= ersion_id) ret |=3D wait_for_decompress_done(); rcu_read_unlock(); trace_ram_load_complete(ret, seq_iter); + + if (!ret && migration_incoming_in_colo_state()) { + colo_flush_ram_cache(); + } return ret; } =20 diff --git a/migration/trace-events b/migration/trace-events index fa0ff3f..bd2d0cd 100644 --- a/migration/trace-events +++ b/migration/trace-events @@ -102,6 +102,8 @@ ram_dirty_bitmap_sync_start(void) "" ram_dirty_bitmap_sync_wait(void) "" ram_dirty_bitmap_sync_complete(void) "" ram_state_resume_prepare(uint64_t v) "%" PRId64 +colo_flush_ram_cache_begin(uint64_t dirty_pages) "dirty_pages %" PRIu64 +colo_flush_ram_cache_end(void) "" =20 # migration/migration.c await_return_path_close_on_source_close(void) "" --=20 2.5.0 From nobody Thu May 16 07:55:05 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (208.118.235.17 [208.118.235.17]) by mx.zohomail.com with SMTPS id 1539919861389696.7715554553057; Thu, 18 Oct 2018 20:31:01 -0700 (PDT) Received: from localhost ([::1]:46488 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLUx-0000kh-Hp for importer@patchew.org; Thu, 18 Oct 2018 23:30:51 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:36582) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLNT-0002dP-Hl for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:08 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gDLNP-0002rm-Mv for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:05 -0400 Received: from mx1.redhat.com ([209.132.183.28]:37870) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gDLNN-0002pi-Fb for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:01 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 64C0D4E90B; Fri, 19 Oct 2018 03:22:59 +0000 (UTC) Received: from jason-ThinkPad-T430s.redhat.com (ovpn-12-114.pek2.redhat.com [10.72.12.114]) by smtp.corp.redhat.com (Postfix) with ESMTP id D6807171A8; Fri, 19 Oct 2018 03:22:56 +0000 (UTC) From: Jason Wang To: peter.maydell@linaro.org, qemu-devel@nongnu.org Date: Fri, 19 Oct 2018 11:22:09 +0800 Message-Id: <1539919345-10703-11-git-send-email-jasowang@redhat.com> In-Reply-To: <1539919345-10703-1-git-send-email-jasowang@redhat.com> References: <1539919345-10703-1-git-send-email-jasowang@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.38]); Fri, 19 Oct 2018 03:22:59 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PULL V2 10/26] qmp event: Add COLO_EXIT event to notify users while exited COLO X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Zhang Chen , Jason Wang , zhanghailiang , Li Zhijian , Zhang Chen Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RDMRC_1 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" From: zhanghailiang If some errors happen during VM's COLO FT stage, it's important to notify the users of this event. Together with 'x-colo-lost-heartbeat', Users can intervene in COLO's failover work immediately. If users don't want to get involved in COLO's failover verdict, it is still necessary to notify users that we exited COLO mode. Signed-off-by: zhanghailiang Signed-off-by: Li Zhijian Signed-off-by: Zhang Chen Signed-off-by: Zhang Chen Signed-off-by: Jason Wang --- migration/colo.c | 31 +++++++++++++++++++++++++++++++ qapi/migration.json | 38 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 69 insertions(+) diff --git a/migration/colo.c b/migration/colo.c index d3163b5..bd7390d 100644 --- a/migration/colo.c +++ b/migration/colo.c @@ -28,6 +28,7 @@ #include "net/colo-compare.h" #include "net/colo.h" #include "block/block.h" +#include "qapi/qapi-events-migration.h" =20 static bool vmstate_loading; static Notifier packets_compare_notifier; @@ -514,6 +515,23 @@ out: qemu_fclose(fb); } =20 + /* + * There are only two reasons we can get here, some error happened + * or the user triggered failover. + */ + switch (failover_get_state()) { + case FAILOVER_STATUS_NONE: + qapi_event_send_colo_exit(COLO_MODE_PRIMARY, + COLO_EXIT_REASON_ERROR); + break; + case FAILOVER_STATUS_REQUIRE: + qapi_event_send_colo_exit(COLO_MODE_PRIMARY, + COLO_EXIT_REASON_REQUEST); + break; + default: + abort(); + } + /* Hope this not to be too long to wait here */ qemu_sem_wait(&s->colo_exit_sem); qemu_sem_destroy(&s->colo_exit_sem); @@ -746,6 +764,19 @@ out: error_report_err(local_err); } =20 + switch (failover_get_state()) { + case FAILOVER_STATUS_NONE: + qapi_event_send_colo_exit(COLO_MODE_SECONDARY, + COLO_EXIT_REASON_ERROR); + break; + case FAILOVER_STATUS_REQUIRE: + qapi_event_send_colo_exit(COLO_MODE_SECONDARY, + COLO_EXIT_REASON_REQUEST); + break; + default: + abort(); + } + if (fb) { qemu_fclose(fb); } diff --git a/qapi/migration.json b/qapi/migration.json index 6e8c212..4a18209 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -957,6 +957,44 @@ 'data': [ 'none', 'require', 'active', 'completed', 'relaunch' ] } =20 ## +# @COLO_EXIT: +# +# Emitted when VM finishes COLO mode due to some errors happening or +# at the request of users. +# +# @mode: report COLO mode when COLO exited. +# +# @reason: describes the reason for the COLO exit. +# +# Since: 3.1 +# +# Example: +# +# <- { "timestamp": {"seconds": 2032141960, "microseconds": 417172}, +# "event": "COLO_EXIT", "data": {"mode": "primary", "reason": "reques= t" } } +# +## +{ 'event': 'COLO_EXIT', + 'data': {'mode': 'COLOMode', 'reason': 'COLOExitReason' } } + +## +# @COLOExitReason: +# +# The reason for a COLO exit +# +# @none: no failover has ever happened. This can't occur in the +# COLO_EXIT event, only in the result of query-colo-status. +# +# @request: COLO exit is due to an external request +# +# @error: COLO exit is due to an internal error +# +# Since: 3.1 +## +{ 'enum': 'COLOExitReason', + 'data': [ 'none', 'request', 'error' ] } + +## # @x-colo-lost-heartbeat: # # Tell qemu that heartbeat is lost, request it to do takeover procedures. --=20 2.5.0 From nobody Thu May 16 07:55:05 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1539920058029642.8304214372683; Thu, 18 Oct 2018 20:34:18 -0700 (PDT) Received: from localhost ([::1]:46505 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLYG-0003o7-D8 for importer@patchew.org; Thu, 18 Oct 2018 23:34:16 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:36603) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLNV-0002fZ-N1 for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:13 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gDLNT-0002tI-KM for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:09 -0400 Received: from mx1.redhat.com ([209.132.183.28]:58122) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gDLNQ-0002qx-VE for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:05 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id ED40F30F5594; Fri, 19 Oct 2018 03:23:01 +0000 (UTC) Received: from jason-ThinkPad-T430s.redhat.com (ovpn-12-114.pek2.redhat.com [10.72.12.114]) by smtp.corp.redhat.com (Postfix) with ESMTP id E5ED55C69A; Fri, 19 Oct 2018 03:22:59 +0000 (UTC) From: Jason Wang To: peter.maydell@linaro.org, qemu-devel@nongnu.org Date: Fri, 19 Oct 2018 11:22:10 +0800 Message-Id: <1539919345-10703-12-git-send-email-jasowang@redhat.com> In-Reply-To: <1539919345-10703-1-git-send-email-jasowang@redhat.com> References: <1539919345-10703-1-git-send-email-jasowang@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.45]); Fri, 19 Oct 2018 03:23:02 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PULL V2 11/26] qapi/migration.json: Rename COLO unknown mode to none mode. X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Zhang Chen , Jason Wang , Zhang Chen Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RDMRC_1 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" From: Zhang Chen Suggested by Markus Armbruster rename COLO unknown mode to none mode. Signed-off-by: Zhang Chen Signed-off-by: Zhang Chen Reviewed-by: Eric Blake Reviewed-by: Markus Armbruster Signed-off-by: Jason Wang --- migration/colo-failover.c | 2 +- migration/colo.c | 2 +- qapi/migration.json | 10 +++++----- 3 files changed, 7 insertions(+), 7 deletions(-) diff --git a/migration/colo-failover.c b/migration/colo-failover.c index 0ae0c41..4854a96 100644 --- a/migration/colo-failover.c +++ b/migration/colo-failover.c @@ -77,7 +77,7 @@ FailoverStatus failover_get_state(void) =20 void qmp_x_colo_lost_heartbeat(Error **errp) { - if (get_colo_mode() =3D=3D COLO_MODE_UNKNOWN) { + if (get_colo_mode() =3D=3D COLO_MODE_NONE) { error_setg(errp, QERR_FEATURE_DISABLED, "colo"); return; } diff --git a/migration/colo.c b/migration/colo.c index bd7390d..2cdd366 100644 --- a/migration/colo.c +++ b/migration/colo.c @@ -160,7 +160,7 @@ COLOMode get_colo_mode(void) } else if (migration_incoming_in_colo_state()) { return COLO_MODE_SECONDARY; } else { - return COLO_MODE_UNKNOWN; + return COLO_MODE_NONE; } } =20 diff --git a/qapi/migration.json b/qapi/migration.json index 4a18209..0776e0f 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -923,18 +923,18 @@ ## # @COLOMode: # -# The colo mode +# The COLO current mode. # -# @unknown: unknown mode +# @none: COLO is disabled. # -# @primary: master side +# @primary: COLO node in primary side. # -# @secondary: slave side +# @secondary: COLO node in slave side. # # Since: 2.8 ## { 'enum': 'COLOMode', - 'data': [ 'unknown', 'primary', 'secondary'] } + 'data': [ 'none', 'primary', 'secondary'] } =20 ## # @FailoverStatus: --=20 2.5.0 From nobody Thu May 16 07:55:05 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1539920224067253.16004969723087; Thu, 18 Oct 2018 20:37:04 -0700 (PDT) Received: from localhost ([::1]:46522 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLas-0005rY-DG for importer@patchew.org; Thu, 18 Oct 2018 23:36:58 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:36681) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLNZ-0002jJ-Nj for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:14 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gDLNV-0002u3-Gt for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:13 -0400 Received: from mx1.redhat.com ([209.132.183.28]:60412) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gDLNS-0002sM-W7 for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:07 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 7073F30842AC; Fri, 19 Oct 2018 03:23:04 +0000 (UTC) Received: from jason-ThinkPad-T430s.redhat.com (ovpn-12-114.pek2.redhat.com [10.72.12.114]) by smtp.corp.redhat.com (Postfix) with ESMTP id 70FFA5C69A; Fri, 19 Oct 2018 03:23:02 +0000 (UTC) From: Jason Wang To: peter.maydell@linaro.org, qemu-devel@nongnu.org Date: Fri, 19 Oct 2018 11:22:11 +0800 Message-Id: <1539919345-10703-13-git-send-email-jasowang@redhat.com> In-Reply-To: <1539919345-10703-1-git-send-email-jasowang@redhat.com> References: <1539919345-10703-1-git-send-email-jasowang@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.40]); Fri, 19 Oct 2018 03:23:04 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PULL V2 12/26] qapi: Add new command to query colo status X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Zhang Chen , Jason Wang , Zhang Chen Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RDMRC_1 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" From: Zhang Chen Libvirt or other high level software can use this command query colo status. You can test this command like that: {'execute':'query-colo-status'} Signed-off-by: Zhang Chen Signed-off-by: Zhang Chen Signed-off-by: Jason Wang --- migration/colo.c | 21 +++++++++++++++++++++ qapi/migration.json | 32 ++++++++++++++++++++++++++++++++ 2 files changed, 53 insertions(+) diff --git a/migration/colo.c b/migration/colo.c index 2cdd366..94c4e09 100644 --- a/migration/colo.c +++ b/migration/colo.c @@ -29,6 +29,7 @@ #include "net/colo.h" #include "block/block.h" #include "qapi/qapi-events-migration.h" +#include "qapi/qmp/qerror.h" =20 static bool vmstate_loading; static Notifier packets_compare_notifier; @@ -237,6 +238,26 @@ void qmp_xen_colo_do_checkpoint(Error **errp) #endif } =20 +COLOStatus *qmp_query_colo_status(Error **errp) +{ + COLOStatus *s =3D g_new0(COLOStatus, 1); + + s->mode =3D get_colo_mode(); + + switch (failover_get_state()) { + case FAILOVER_STATUS_NONE: + s->reason =3D COLO_EXIT_REASON_NONE; + break; + case FAILOVER_STATUS_REQUIRE: + s->reason =3D COLO_EXIT_REASON_REQUEST; + break; + default: + s->reason =3D COLO_EXIT_REASON_ERROR; + } + + return s; +} + static void colo_send_message(QEMUFile *f, COLOMessage msg, Error **errp) { diff --git a/qapi/migration.json b/qapi/migration.json index 0776e0f..0928f4b 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -1308,6 +1308,38 @@ { 'command': 'xen-colo-do-checkpoint' } =20 ## +# @COLOStatus: +# +# The result format for 'query-colo-status'. +# +# @mode: COLO running mode. If COLO is running, this field will return +# 'primary' or 'secondary'. +# +# @reason: describes the reason for the COLO exit. +# +# Since: 3.0 +## +{ 'struct': 'COLOStatus', + 'data': { 'mode': 'COLOMode', 'reason': 'COLOExitReason' } } + +## +# @query-colo-status: +# +# Query COLO status while the vm is running. +# +# Returns: A @COLOStatus object showing the status. +# +# Example: +# +# -> { "execute": "query-colo-status" } +# <- { "return": { "mode": "primary", "active": true, "reason": "request" = } } +# +# Since: 3.0 +## +{ 'command': 'query-colo-status', + 'returns': 'COLOStatus' } + +## # @migrate-recover: # # Provide a recovery migration stream URI. --=20 2.5.0 From nobody Thu May 16 07:55:05 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1539920061660241.01782521906046; Thu, 18 Oct 2018 20:34:21 -0700 (PDT) Received: from localhost ([::1]:46503 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLYK-0003nx-FE for importer@patchew.org; Thu, 18 Oct 2018 23:34:20 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:36680) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLNZ-0002jI-Nj for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:15 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gDLNW-0002vI-3F for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:13 -0400 Received: from mx1.redhat.com ([209.132.183.28]:59026) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gDLNV-0002tQ-MM for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:09 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 846C07F6CE; Fri, 19 Oct 2018 03:23:07 +0000 (UTC) Received: from jason-ThinkPad-T430s.redhat.com (ovpn-12-114.pek2.redhat.com [10.72.12.114]) by smtp.corp.redhat.com (Postfix) with ESMTP id F16825C69A; Fri, 19 Oct 2018 03:23:04 +0000 (UTC) From: Jason Wang To: peter.maydell@linaro.org, qemu-devel@nongnu.org Date: Fri, 19 Oct 2018 11:22:12 +0800 Message-Id: <1539919345-10703-14-git-send-email-jasowang@redhat.com> In-Reply-To: <1539919345-10703-1-git-send-email-jasowang@redhat.com> References: <1539919345-10703-1-git-send-email-jasowang@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.25]); Fri, 19 Oct 2018 03:23:07 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PULL V2 13/26] savevm: split the process of different stages for loadvm/savevm X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Zhang Chen , Jason Wang , zhanghailiang , Li Zhijian , Zhang Chen Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RDMRC_1 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" From: Zhang Chen There are several stages during loadvm/savevm process. In different stage, migration incoming processes different types of sections. We want to control these stages more accuracy, it will benefit COLO performance, we don't have to save type of QEMU_VM_SECTION_START sections everytime while do checkpoint, besides, we want to separate the process of saving/loading memory and devices state. So we add three new helper functions: qemu_load_device_state() and qemu_savevm_live_state() to achieve different process during migration. Besides, we make qemu_loadvm_state_main() and qemu_save_device_state() public, and simplify the codes of qemu_save_device_state() by calling the wrapper qemu_savevm_state_header(). Signed-off-by: zhanghailiang Signed-off-by: Li Zhijian Signed-off-by: Zhang Chen Signed-off-by: Zhang Chen Reviewed-by: Dr. David Alan Gilbert Signed-off-by: Jason Wang --- migration/colo.c | 41 ++++++++++++++++++++++++++++++++--------- migration/savevm.c | 36 +++++++++++++++++++++++++++++------- migration/savevm.h | 4 ++++ 3 files changed, 65 insertions(+), 16 deletions(-) diff --git a/migration/colo.c b/migration/colo.c index 94c4e09..59bb507 100644 --- a/migration/colo.c +++ b/migration/colo.c @@ -30,6 +30,7 @@ #include "block/block.h" #include "qapi/qapi-events-migration.h" #include "qapi/qmp/qerror.h" +#include "sysemu/cpus.h" =20 static bool vmstate_loading; static Notifier packets_compare_notifier; @@ -401,24 +402,35 @@ static int colo_do_checkpoint_transaction(MigrationSt= ate *s, =20 /* Disable block migration */ migrate_set_block_enabled(false, &local_err); - qemu_savevm_state_header(fb); - qemu_savevm_state_setup(fb); qemu_mutex_lock_iothread(); replication_do_checkpoint_all(&local_err); if (local_err) { qemu_mutex_unlock_iothread(); goto out; } - qemu_savevm_state_complete_precopy(fb, false, false); - qemu_mutex_unlock_iothread(); - - qemu_fflush(fb); =20 colo_send_message(s->to_dst_file, COLO_MESSAGE_VMSTATE_SEND, &local_er= r); if (local_err) { + qemu_mutex_unlock_iothread(); + goto out; + } + /* Note: device state is saved into buffer */ + ret =3D qemu_save_device_state(fb); + + qemu_mutex_unlock_iothread(); + if (ret < 0) { goto out; } /* + * Only save VM's live state, which not including device state. + * TODO: We may need a timeout mechanism to prevent COLO process + * to be blocked here. + */ + qemu_savevm_live_state(s->to_dst_file); + + qemu_fflush(fb); + + /* * We need the size of the VMstate data in Secondary side, * With which we can decide how much data should be read. */ @@ -635,6 +647,7 @@ void *colo_process_incoming_thread(void *opaque) uint64_t total_size; uint64_t value; Error *local_err =3D NULL; + int ret; =20 rcu_register_thread(); qemu_sem_init(&mis->colo_incoming_sem, 0); @@ -708,6 +721,16 @@ void *colo_process_incoming_thread(void *opaque) goto out; } =20 + qemu_mutex_lock_iothread(); + cpu_synchronize_all_pre_loadvm(); + ret =3D qemu_loadvm_state_main(mis->from_src_file, mis); + qemu_mutex_unlock_iothread(); + + if (ret < 0) { + error_report("Load VM's live state (ram) error"); + goto out; + } + value =3D colo_receive_message_value(mis->from_src_file, COLO_MESSAGE_VMSTATE_SIZE, &local_err); if (local_err) { @@ -739,10 +762,10 @@ void *colo_process_incoming_thread(void *opaque) } =20 qemu_mutex_lock_iothread(); - qemu_system_reset(SHUTDOWN_CAUSE_NONE); vmstate_loading =3D true; - if (qemu_loadvm_state(fb) < 0) { - error_report("COLO: loadvm failed"); + ret =3D qemu_load_device_state(fb); + if (ret < 0) { + error_report("COLO: load device state failed"); qemu_mutex_unlock_iothread(); goto out; } diff --git a/migration/savevm.c b/migration/savevm.c index 288b807..e4caff9 100644 --- a/migration/savevm.c +++ b/migration/savevm.c @@ -1378,13 +1378,21 @@ done: return ret; } =20 -static int qemu_save_device_state(QEMUFile *f) +void qemu_savevm_live_state(QEMUFile *f) { - SaveStateEntry *se; + /* save QEMU_VM_SECTION_END section */ + qemu_savevm_state_complete_precopy(f, true, false); + qemu_put_byte(f, QEMU_VM_EOF); +} =20 - qemu_put_be32(f, QEMU_VM_FILE_MAGIC); - qemu_put_be32(f, QEMU_VM_FILE_VERSION); +int qemu_save_device_state(QEMUFile *f) +{ + SaveStateEntry *se; =20 + if (!migration_in_colo_state()) { + qemu_put_be32(f, QEMU_VM_FILE_MAGIC); + qemu_put_be32(f, QEMU_VM_FILE_VERSION); + } cpu_synchronize_all_states(); =20 QTAILQ_FOREACH(se, &savevm_state.handlers, entry) { @@ -1440,8 +1448,6 @@ enum LoadVMExitCodes { LOADVM_QUIT =3D 1, }; =20 -static int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis= ); - /* ------ incoming postcopy messages ------ */ /* 'advise' arrives before any transfers just to tell us that a postcopy * *might* happen - it might be skipped if precopy transferred everything @@ -2247,7 +2253,7 @@ static bool postcopy_pause_incoming(MigrationIncoming= State *mis) return true; } =20 -static int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis) +int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis) { uint8_t section_type; int ret =3D 0; @@ -2418,6 +2424,22 @@ int qemu_loadvm_state(QEMUFile *f) return ret; } =20 +int qemu_load_device_state(QEMUFile *f) +{ + MigrationIncomingState *mis =3D migration_incoming_get_current(); + int ret; + + /* Load QEMU_VM_SECTION_FULL section */ + ret =3D qemu_loadvm_state_main(f, mis); + if (ret < 0) { + error_report("Failed to load device state: %d", ret); + return ret; + } + + cpu_synchronize_all_post_init(); + return 0; +} + int save_snapshot(const char *name, Error **errp) { BlockDriverState *bs, *bs1; diff --git a/migration/savevm.h b/migration/savevm.h index 8373c2f..51a4b9c 100644 --- a/migration/savevm.h +++ b/migration/savevm.h @@ -56,8 +56,12 @@ void qemu_savevm_send_postcopy_ram_discard(QEMUFile *f, = const char *name, uint64_t *start_list, uint64_t *length_list); void qemu_savevm_send_colo_enable(QEMUFile *f); +void qemu_savevm_live_state(QEMUFile *f); +int qemu_save_device_state(QEMUFile *f); =20 int qemu_loadvm_state(QEMUFile *f); void qemu_loadvm_state_cleanup(void); +int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis); +int qemu_load_device_state(QEMUFile *f); =20 #endif --=20 2.5.0 From nobody Thu May 16 07:55:05 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (208.118.235.17 [208.118.235.17]) by mx.zohomail.com with SMTPS id 1539919863274414.41594316475766; Thu, 18 Oct 2018 20:31:03 -0700 (PDT) Received: from localhost ([::1]:46489 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLUx-0000kq-6Z for importer@patchew.org; Thu, 18 Oct 2018 23:30:51 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:36679) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLNZ-0002jH-Na for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:14 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gDLNX-0002xF-Qs for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:13 -0400 Received: from mx1.redhat.com ([209.132.183.28]:35362) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gDLNX-0002w3-HJ for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:11 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 979DC30DF710; Fri, 19 Oct 2018 03:23:10 +0000 (UTC) Received: from jason-ThinkPad-T430s.redhat.com (ovpn-12-114.pek2.redhat.com [10.72.12.114]) by smtp.corp.redhat.com (Postfix) with ESMTP id 1100C5C69A; Fri, 19 Oct 2018 03:23:07 +0000 (UTC) From: Jason Wang To: peter.maydell@linaro.org, qemu-devel@nongnu.org Date: Fri, 19 Oct 2018 11:22:13 +0800 Message-Id: <1539919345-10703-15-git-send-email-jasowang@redhat.com> In-Reply-To: <1539919345-10703-1-git-send-email-jasowang@redhat.com> References: <1539919345-10703-1-git-send-email-jasowang@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.49]); Fri, 19 Oct 2018 03:23:10 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PULL V2 14/26] COLO: flush host dirty ram from cache X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Zhang Chen , Jason Wang , zhanghailiang , Li Zhijian , Zhang Chen Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RDMRC_1 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" From: zhanghailiang Don't need to flush all VM's ram from cache, only flush the dirty pages since last checkpoint Signed-off-by: Li Zhijian Signed-off-by: Zhang Chen Signed-off-by: Zhang Chen Signed-off-by: zhanghailiang Reviewed-by: Dr. David Alan Gilbert Signed-off-by: Jason Wang --- migration/ram.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/migration/ram.c b/migration/ram.c index 477853d..7e7deec 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -3714,6 +3714,7 @@ int colo_init_ram_cache(void) } ram_state =3D g_new0(RAMState, 1); ram_state->migration_dirty_pages =3D 0; + memory_global_dirty_log_start(); =20 return 0; =20 @@ -3735,6 +3736,7 @@ void colo_release_ram_cache(void) { RAMBlock *block; =20 + memory_global_dirty_log_stop(); RAMBLOCK_FOREACH_MIGRATABLE(block) { g_free(block->bmap); block->bmap =3D NULL; @@ -3985,6 +3987,13 @@ static void colo_flush_ram_cache(void) void *src_host; unsigned long offset =3D 0; =20 + memory_global_dirty_log_sync(); + rcu_read_lock(); + RAMBLOCK_FOREACH_MIGRATABLE(block) { + migration_bitmap_sync_range(ram_state, block, 0, block->used_lengt= h); + } + rcu_read_unlock(); + trace_colo_flush_ram_cache_begin(ram_state->migration_dirty_pages); rcu_read_lock(); block =3D QLIST_FIRST_RCU(&ram_list.blocks); --=20 2.5.0 From nobody Thu May 16 07:55:05 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (208.118.235.17 [208.118.235.17]) by mx.zohomail.com with SMTPS id 1539920231298936.8200029734347; Thu, 18 Oct 2018 20:37:11 -0700 (PDT) Received: from localhost ([::1]:46523 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLat-0005uY-ST for importer@patchew.org; Thu, 18 Oct 2018 23:36:59 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:36720) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLNg-0002nh-E3 for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:24 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gDLNa-000314-Tg for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:18 -0400 Received: from mx1.redhat.com ([209.132.183.28]:35412) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gDLNa-0002zh-A9 for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:14 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 6512D309EAB7; Fri, 19 Oct 2018 03:23:13 +0000 (UTC) Received: from jason-ThinkPad-T430s.redhat.com (ovpn-12-114.pek2.redhat.com [10.72.12.114]) by smtp.corp.redhat.com (Postfix) with ESMTP id 21A41171A8; Fri, 19 Oct 2018 03:23:10 +0000 (UTC) From: Jason Wang To: peter.maydell@linaro.org, qemu-devel@nongnu.org Date: Fri, 19 Oct 2018 11:22:14 +0800 Message-Id: <1539919345-10703-16-git-send-email-jasowang@redhat.com> In-Reply-To: <1539919345-10703-1-git-send-email-jasowang@redhat.com> References: <1539919345-10703-1-git-send-email-jasowang@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.49]); Fri, 19 Oct 2018 03:23:13 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PULL V2 15/26] filter: Add handle_event method for NetFilterClass X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Zhang Chen , Jason Wang , zhanghailiang , Zhang Chen Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RDMRC_1 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" From: Zhang Chen Filter needs to process the event of checkpoint/failover or other event passed by COLO frame. Signed-off-by: zhanghailiang Signed-off-by: Zhang Chen Signed-off-by: Zhang Chen Signed-off-by: Jason Wang --- include/net/filter.h | 5 +++++ net/filter.c | 17 +++++++++++++++++ net/net.c | 19 +++++++++++++++++++ 3 files changed, 41 insertions(+) diff --git a/include/net/filter.h b/include/net/filter.h index 435acd6..49da666 100644 --- a/include/net/filter.h +++ b/include/net/filter.h @@ -38,6 +38,8 @@ typedef ssize_t (FilterReceiveIOV)(NetFilterState *nc, =20 typedef void (FilterStatusChanged) (NetFilterState *nf, Error **errp); =20 +typedef void (FilterHandleEvent) (NetFilterState *nf, int event, Error **e= rrp); + typedef struct NetFilterClass { ObjectClass parent_class; =20 @@ -45,6 +47,7 @@ typedef struct NetFilterClass { FilterSetup *setup; FilterCleanup *cleanup; FilterStatusChanged *status_changed; + FilterHandleEvent *handle_event; /* mandatory */ FilterReceiveIOV *receive_iov; } NetFilterClass; @@ -77,4 +80,6 @@ ssize_t qemu_netfilter_pass_to_next(NetClientState *sende= r, int iovcnt, void *opaque); =20 +void colo_notify_filters_event(int event, Error **errp); + #endif /* QEMU_NET_FILTER_H */ diff --git a/net/filter.c b/net/filter.c index 2fd7d7d..c9f9e5f 100644 --- a/net/filter.c +++ b/net/filter.c @@ -17,6 +17,8 @@ #include "net/vhost_net.h" #include "qom/object_interfaces.h" #include "qemu/iov.h" +#include "net/colo.h" +#include "migration/colo.h" =20 static inline bool qemu_can_skip_netfilter(NetFilterState *nf) { @@ -245,11 +247,26 @@ static void netfilter_finalize(Object *obj) g_free(nf->netdev_id); } =20 +static void default_handle_event(NetFilterState *nf, int event, Error **er= rp) +{ + switch (event) { + case COLO_EVENT_CHECKPOINT: + break; + case COLO_EVENT_FAILOVER: + object_property_set_str(OBJECT(nf), "off", "status", errp); + break; + default: + break; + } +} + static void netfilter_class_init(ObjectClass *oc, void *data) { UserCreatableClass *ucc =3D USER_CREATABLE_CLASS(oc); + NetFilterClass *nfc =3D NETFILTER_CLASS(oc); =20 ucc->complete =3D netfilter_complete; + nfc->handle_event =3D default_handle_event; } =20 static const TypeInfo netfilter_info =3D { diff --git a/net/net.c b/net/net.c index cdcd5cf..c66847e 100644 --- a/net/net.c +++ b/net/net.c @@ -1335,6 +1335,25 @@ void hmp_info_network(Monitor *mon, const QDict *qdi= ct) } } =20 +void colo_notify_filters_event(int event, Error **errp) +{ + NetClientState *nc; + NetFilterState *nf; + NetFilterClass *nfc =3D NULL; + Error *local_err =3D NULL; + + QTAILQ_FOREACH(nc, &net_clients, next) { + QTAILQ_FOREACH(nf, &nc->filters, next) { + nfc =3D NETFILTER_GET_CLASS(OBJECT(nf)); + nfc->handle_event(nf, event, &local_err); + if (local_err) { + error_propagate(errp, local_err); + return; + } + } + } +} + void qmp_set_link(const char *name, bool up, Error **errp) { NetClientState *ncs[MAX_QUEUE_NUM]; --=20 2.5.0 From nobody Thu May 16 07:55:05 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1539920345711830.0197623849166; Thu, 18 Oct 2018 20:39:05 -0700 (PDT) Received: from localhost ([::1]:46534 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLcs-00082I-A0 for importer@patchew.org; Thu, 18 Oct 2018 23:39:02 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:36816) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLNw-00031e-6Z for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:37 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gDLNq-00038M-5I for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:34 -0400 Received: from mx1.redhat.com ([209.132.183.28]:46604) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gDLNj-00031a-3Z for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:26 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 30A4F30F8BF0; Fri, 19 Oct 2018 03:23:16 +0000 (UTC) Received: from jason-ThinkPad-T430s.redhat.com (ovpn-12-114.pek2.redhat.com [10.72.12.114]) by smtp.corp.redhat.com (Postfix) with ESMTP id E4016171A8; Fri, 19 Oct 2018 03:23:13 +0000 (UTC) From: Jason Wang To: peter.maydell@linaro.org, qemu-devel@nongnu.org Date: Fri, 19 Oct 2018 11:22:15 +0800 Message-Id: <1539919345-10703-17-git-send-email-jasowang@redhat.com> In-Reply-To: <1539919345-10703-1-git-send-email-jasowang@redhat.com> References: <1539919345-10703-1-git-send-email-jasowang@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.46]); Fri, 19 Oct 2018 03:23:16 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PULL V2 16/26] filter-rewriter: handle checkpoint and failover event X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Zhang Chen , Jason Wang , zhanghailiang , Zhang Chen Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RDMRC_1 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" From: Zhang Chen After one round of checkpoint, the states between PVM and SVM become consistent, so it is unnecessary to adjust the sequence of net packets for old connections, besides, while failover happens, filter-rewriter will into failover mode that needn't handle the new TCP connection. Signed-off-by: zhanghailiang Signed-off-by: Zhang Chen Signed-off-by: Zhang Chen Signed-off-by: Jason Wang --- net/colo-compare.c | 12 +++++------ net/colo.c | 8 ++++++++ net/colo.h | 2 ++ net/filter-rewriter.c | 57 +++++++++++++++++++++++++++++++++++++++++++++++= ++++ 4 files changed, 73 insertions(+), 6 deletions(-) diff --git a/net/colo-compare.c b/net/colo-compare.c index 3f7e240..a39191d 100644 --- a/net/colo-compare.c +++ b/net/colo-compare.c @@ -116,6 +116,12 @@ enum { SECONDARY_IN, }; =20 +static void colo_compare_inconsistency_notify(void) +{ + notifier_list_notify(&colo_compare_notifiers, + migrate_get_current()); +} + static int compare_chr_send(CompareState *s, const uint8_t *buf, uint32_t size, @@ -330,12 +336,6 @@ static bool colo_mark_tcp_pkt(Packet *ppkt, Packet *sp= kt, return false; } =20 -static void colo_compare_inconsistency_notify(void) -{ - notifier_list_notify(&colo_compare_notifiers, - migrate_get_current()); -} - static void colo_compare_tcp(CompareState *s, Connection *conn) { Packet *ppkt =3D NULL, *spkt =3D NULL; diff --git a/net/colo.c b/net/colo.c index 97c8fc9..49176bf 100644 --- a/net/colo.c +++ b/net/colo.c @@ -221,3 +221,11 @@ Connection *connection_get(GHashTable *connection_trac= k_table, =20 return conn; } + +bool connection_has_tracked(GHashTable *connection_track_table, + ConnectionKey *key) +{ + Connection *conn =3D g_hash_table_lookup(connection_track_table, key); + + return conn ? true : false; +} diff --git a/net/colo.h b/net/colo.h index 0277e0e..11c5226 100644 --- a/net/colo.h +++ b/net/colo.h @@ -98,6 +98,8 @@ void connection_destroy(void *opaque); Connection *connection_get(GHashTable *connection_track_table, ConnectionKey *key, GQueue *conn_list); +bool connection_has_tracked(GHashTable *connection_track_table, + ConnectionKey *key); void connection_hashtable_reset(GHashTable *connection_track_table); Packet *packet_new(const void *data, int size, int vnet_hdr_len); void packet_destroy(void *opaque, void *user_data); diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c index dd323fa..bb8f4d9 100644 --- a/net/filter-rewriter.c +++ b/net/filter-rewriter.c @@ -20,11 +20,15 @@ #include "qemu/main-loop.h" #include "qemu/iov.h" #include "net/checksum.h" +#include "net/colo.h" +#include "migration/colo.h" =20 #define FILTER_COLO_REWRITER(obj) \ OBJECT_CHECK(RewriterState, (obj), TYPE_FILTER_REWRITER) =20 #define TYPE_FILTER_REWRITER "filter-rewriter" +#define FAILOVER_MODE_ON true +#define FAILOVER_MODE_OFF false =20 typedef struct RewriterState { NetFilterState parent_obj; @@ -32,8 +36,14 @@ typedef struct RewriterState { /* hashtable to save connection */ GHashTable *connection_track_table; bool vnet_hdr; + bool failover_mode; } RewriterState; =20 +static void filter_rewriter_failover_mode(RewriterState *s) +{ + s->failover_mode =3D FAILOVER_MODE_ON; +} + static void filter_rewriter_flush(NetFilterState *nf) { RewriterState *s =3D FILTER_COLO_REWRITER(nf); @@ -273,6 +283,13 @@ static ssize_t colo_rewriter_receive_iov(NetFilterStat= e *nf, */ reverse_connection_key(&key); } + + /* After failover we needn't change new TCP packet */ + if (s->failover_mode && + !connection_has_tracked(s->connection_track_table, &key)) { + goto out; + } + conn =3D connection_get(s->connection_track_table, &key, NULL); @@ -306,11 +323,49 @@ static ssize_t colo_rewriter_receive_iov(NetFilterSta= te *nf, } } =20 +out: packet_destroy(pkt, NULL); pkt =3D NULL; return 0; } =20 +static void reset_seq_offset(gpointer key, gpointer value, gpointer user_d= ata) +{ + Connection *conn =3D (Connection *)value; + + conn->offset =3D 0; +} + +static gboolean offset_is_nonzero(gpointer key, + gpointer value, + gpointer user_data) +{ + Connection *conn =3D (Connection *)value; + + return conn->offset ? true : false; +} + +static void colo_rewriter_handle_event(NetFilterState *nf, int event, + Error **errp) +{ + RewriterState *rs =3D FILTER_COLO_REWRITER(nf); + + switch (event) { + case COLO_EVENT_CHECKPOINT: + g_hash_table_foreach(rs->connection_track_table, + reset_seq_offset, NULL); + break; + case COLO_EVENT_FAILOVER: + if (!g_hash_table_find(rs->connection_track_table, + offset_is_nonzero, NULL)) { + filter_rewriter_failover_mode(rs); + } + break; + default: + break; + } +} + static void colo_rewriter_cleanup(NetFilterState *nf) { RewriterState *s =3D FILTER_COLO_REWRITER(nf); @@ -354,6 +409,7 @@ static void filter_rewriter_init(Object *obj) RewriterState *s =3D FILTER_COLO_REWRITER(obj); =20 s->vnet_hdr =3D false; + s->failover_mode =3D FAILOVER_MODE_OFF; object_property_add_bool(obj, "vnet_hdr_support", filter_rewriter_get_vnet_hdr, filter_rewriter_set_vnet_hdr, NULL); @@ -366,6 +422,7 @@ static void colo_rewriter_class_init(ObjectClass *oc, v= oid *data) nfc->setup =3D colo_rewriter_setup; nfc->cleanup =3D colo_rewriter_cleanup; nfc->receive_iov =3D colo_rewriter_receive_iov; + nfc->handle_event =3D colo_rewriter_handle_event; } =20 static const TypeInfo colo_rewriter_info =3D { --=20 2.5.0 From nobody Thu May 16 07:55:05 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 153992012672135.514037849966826; Thu, 18 Oct 2018 20:35:26 -0700 (PDT) Received: from localhost ([::1]:46506 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLZN-0004dp-Ik for importer@patchew.org; Thu, 18 Oct 2018 23:35:25 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:36817) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLNw-00031k-6w for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:37 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gDLNs-00039s-51 for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:34 -0400 Received: from mx1.redhat.com ([209.132.183.28]:46616) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gDLNm-00032w-P5 for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:28 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 6BB5E3001740; Fri, 19 Oct 2018 03:23:18 +0000 (UTC) Received: from jason-ThinkPad-T430s.redhat.com (ovpn-12-114.pek2.redhat.com [10.72.12.114]) by smtp.corp.redhat.com (Postfix) with ESMTP id B13A15C69A; Fri, 19 Oct 2018 03:23:16 +0000 (UTC) From: Jason Wang To: peter.maydell@linaro.org, qemu-devel@nongnu.org Date: Fri, 19 Oct 2018 11:22:16 +0800 Message-Id: <1539919345-10703-18-git-send-email-jasowang@redhat.com> In-Reply-To: <1539919345-10703-1-git-send-email-jasowang@redhat.com> References: <1539919345-10703-1-git-send-email-jasowang@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.46]); Fri, 19 Oct 2018 03:23:18 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PULL V2 17/26] COLO: notify net filters about checkpoint/failover event X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jason Wang , zhanghailiang Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RDMRC_1 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" From: zhanghailiang Notify all net filters about the checkpoint and failover event. Signed-off-by: zhanghailiang Reviewed-by: Dr. David Alan Gilbert Signed-off-by: Jason Wang --- migration/colo.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/migration/colo.c b/migration/colo.c index 59bb507..57a8542 100644 --- a/migration/colo.c +++ b/migration/colo.c @@ -31,6 +31,7 @@ #include "qapi/qapi-events-migration.h" #include "qapi/qmp/qerror.h" #include "sysemu/cpus.h" +#include "net/filter.h" =20 static bool vmstate_loading; static Notifier packets_compare_notifier; @@ -83,6 +84,12 @@ static void secondary_vm_do_failover(void) error_report_err(local_err); } =20 + /* Notify all filters of all NIC to do checkpoint */ + colo_notify_filters_event(COLO_EVENT_FAILOVER, &local_err); + if (local_err) { + error_report_err(local_err); + } + if (!autostart) { error_report("\"-S\" qemu option will be ignored in secondary side= "); /* recover runstate to normal migration finish state */ @@ -782,6 +789,14 @@ void *colo_process_incoming_thread(void *opaque) goto out; } =20 + /* Notify all filters of all NIC to do checkpoint */ + colo_notify_filters_event(COLO_EVENT_CHECKPOINT, &local_err); + + if (local_err) { + qemu_mutex_unlock_iothread(); + goto out; + } + vmstate_loading =3D false; vm_start(); trace_colo_vm_state_change("stop", "run"); --=20 2.5.0 From nobody Thu May 16 07:55:05 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1539920451780194.43776428594924; Thu, 18 Oct 2018 20:40:51 -0700 (PDT) Received: from localhost ([::1]:46544 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLec-0000nZ-II for importer@patchew.org; Thu, 18 Oct 2018 23:40:50 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:36873) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLNx-00032n-Ai for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:38 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gDLNw-0003Fz-C4 for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:37 -0400 Received: from mx1.redhat.com ([209.132.183.28]:37934) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gDLNu-00034X-9u for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:36 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id E56EE711F7; Fri, 19 Oct 2018 03:23:21 +0000 (UTC) Received: from jason-ThinkPad-T430s.redhat.com (ovpn-12-114.pek2.redhat.com [10.72.12.114]) by smtp.corp.redhat.com (Postfix) with ESMTP id EBA0E171A8; Fri, 19 Oct 2018 03:23:18 +0000 (UTC) From: Jason Wang To: peter.maydell@linaro.org, qemu-devel@nongnu.org Date: Fri, 19 Oct 2018 11:22:17 +0800 Message-Id: <1539919345-10703-19-git-send-email-jasowang@redhat.com> In-Reply-To: <1539919345-10703-1-git-send-email-jasowang@redhat.com> References: <1539919345-10703-1-git-send-email-jasowang@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.38]); Fri, 19 Oct 2018 03:23:21 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PULL V2 18/26] COLO: quick failover process by kick COLO thread X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jason Wang , zhanghailiang Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RDMRC_1 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" From: zhanghailiang COLO thread may sleep at qemu_sem_wait(&s->colo_checkpoint_sem), while failover works begin, It's better to wakeup it to quick the process. Signed-off-by: zhanghailiang Reviewed-by: Dr. David Alan Gilbert Signed-off-by: Jason Wang --- migration/colo.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/migration/colo.c b/migration/colo.c index 57a8542..956ac23 100644 --- a/migration/colo.c +++ b/migration/colo.c @@ -131,6 +131,11 @@ static void primary_vm_do_failover(void) =20 migrate_set_state(&s->state, MIGRATION_STATUS_COLO, MIGRATION_STATUS_COMPLETED); + /* + * kick COLO thread which might wait at + * qemu_sem_wait(&s->colo_checkpoint_sem). + */ + colo_checkpoint_notify(migrate_get_current()); =20 /* * Wake up COLO thread which may blocked in recv() or send(), @@ -539,6 +544,9 @@ static void colo_process_checkpoint(MigrationState *s) =20 qemu_sem_wait(&s->colo_checkpoint_sem); =20 + if (s->state !=3D MIGRATION_STATUS_COLO) { + goto out; + } ret =3D colo_do_checkpoint_transaction(s, bioc, fb); if (ret < 0) { goto out; --=20 2.5.0 From nobody Thu May 16 07:55:05 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1539920453975441.8903756679309; Thu, 18 Oct 2018 20:40:53 -0700 (PDT) Received: from localhost ([::1]:46545 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLee-0000p6-Vi for importer@patchew.org; Thu, 18 Oct 2018 23:40:53 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:36902) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLNy-00033P-0c for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:39 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gDLNx-0003I4-18 for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:37 -0400 Received: from mx1.redhat.com ([209.132.183.28]:58188) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gDLNw-00035K-C6 for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:36 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 74F1630F5580; Fri, 19 Oct 2018 03:23:24 +0000 (UTC) Received: from jason-ThinkPad-T430s.redhat.com (ovpn-12-114.pek2.redhat.com [10.72.12.114]) by smtp.corp.redhat.com (Postfix) with ESMTP id 7348C5C69A; Fri, 19 Oct 2018 03:23:22 +0000 (UTC) From: Jason Wang To: peter.maydell@linaro.org, qemu-devel@nongnu.org Date: Fri, 19 Oct 2018 11:22:18 +0800 Message-Id: <1539919345-10703-20-git-send-email-jasowang@redhat.com> In-Reply-To: <1539919345-10703-1-git-send-email-jasowang@redhat.com> References: <1539919345-10703-1-git-send-email-jasowang@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.45]); Fri, 19 Oct 2018 03:23:24 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PULL V2 19/26] docs: Add COLO status diagram to COLO-FT.txt X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Zhang Chen , Jason Wang , Zhang Chen Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RDMRC_1 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" From: Zhang Chen This diagram make user better understand COLO. Suggested by Markus Armbruster. Signed-off-by: Zhang Chen Signed-off-by: Zhang Chen Signed-off-by: Jason Wang --- docs/COLO-FT.txt | 34 ++++++++++++++++++++++++++++++++++ 1 file changed, 34 insertions(+) diff --git a/docs/COLO-FT.txt b/docs/COLO-FT.txt index 70cfb9c..6302469 100644 --- a/docs/COLO-FT.txt +++ b/docs/COLO-FT.txt @@ -110,6 +110,40 @@ Note: HeartBeat has not been implemented yet, so you need to trigger failover pr= ocess by using 'x-colo-lost-heartbeat' command. =20 +=3D=3D COLO operation status =3D=3D + ++-----------------+ +| | +| Start COLO | +| | ++--------+--------+ + | + | Main qmp command: + | migrate-set-capabilities with x-colo + | migrate + | + v ++--------+--------+ +| | +| COLO running | +| | ++--------+--------+ + | + | Main qmp command: + | x-colo-lost-heartbeat + | or + | some error happened + v ++--------+--------+ +| | send qmp event: +| COLO failover | COLO_EXIT +| | ++-----------------+ + +COLO use the qmp command to switch and report operation status. +The diagram just shows the main qmp command, you can get the detail +in test procedure. + =3D=3D Test procedure =3D=3D 1. Startup qemu Primary: --=20 2.5.0 From nobody Thu May 16 07:55:05 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1539920345592254.95851665513612; Thu, 18 Oct 2018 20:39:05 -0700 (PDT) Received: from localhost ([::1]:46533 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLcq-00080p-HB for importer@patchew.org; Thu, 18 Oct 2018 23:39:00 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:36865) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLNx-00032Z-59 for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:38 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gDLNw-0003F2-2t for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:36 -0400 Received: from mx1.redhat.com ([209.132.183.28]:58616) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gDLNs-00036C-50; Thu, 18 Oct 2018 23:23:34 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 0035F5F74C; Fri, 19 Oct 2018 03:23:27 +0000 (UTC) Received: from jason-ThinkPad-T430s.redhat.com (ovpn-12-114.pek2.redhat.com [10.72.12.114]) by smtp.corp.redhat.com (Postfix) with ESMTP id 01C705C69A; Fri, 19 Oct 2018 03:23:24 +0000 (UTC) From: Jason Wang To: peter.maydell@linaro.org, qemu-devel@nongnu.org Date: Fri, 19 Oct 2018 11:22:19 +0800 Message-Id: <1539919345-10703-21-git-send-email-jasowang@redhat.com> In-Reply-To: <1539919345-10703-1-git-send-email-jasowang@redhat.com> References: <1539919345-10703-1-git-send-email-jasowang@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.39]); Fri, 19 Oct 2018 03:23:27 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PULL V2 20/26] clean up callback when del virtqueue X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jason Wang , qemu-stable@nongnu.org, liujunjie Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RDMRC_1 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" From: liujunjie Before, we did not clear callback like handle_output when delete the virtqueue which may result be segmentfault. The scene is as follows: 1. Start a vm with multiqueue vhost-net, 2. then we write VIRTIO_PCI_GUEST_FEATURES in PCI configuration to triger multiqueue disable in this vm which will delete the virtqueue. In this step, the tx_bh is deleted but the callback virtio_net_handle_tx_bh still exist. 3. Finally, we write VIRTIO_PCI_QUEUE_NOTIFY in PCI configuration to notify the deleted virtqueue. In this way, virtio_net_handle_tx_bh will be called and qemu will be crashed. Although the way described above is uncommon, we had better reinforce it. CC: qemu-stable@nongnu.org Signed-off-by: liujunjie Signed-off-by: Jason Wang --- hw/virtio/virtio.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c index 4e61944..4136d23 100644 --- a/hw/virtio/virtio.c +++ b/hw/virtio/virtio.c @@ -1611,6 +1611,8 @@ void virtio_del_queue(VirtIODevice *vdev, int n) =20 vdev->vq[n].vring.num =3D 0; vdev->vq[n].vring.num_default =3D 0; + vdev->vq[n].handle_output =3D NULL; + vdev->vq[n].handle_aio_output =3D NULL; } =20 static void virtio_set_isr(VirtIODevice *vdev, int value) --=20 2.5.0 From nobody Thu May 16 07:55:05 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (208.118.235.17 [208.118.235.17]) by mx.zohomail.com with SMTPS id 1539920570695378.0962074588068; Thu, 18 Oct 2018 20:42:50 -0700 (PDT) Received: from localhost ([::1]:46550 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLgM-0001xW-40 for importer@patchew.org; Thu, 18 Oct 2018 23:42:38 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:36897) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLNx-00033H-SI for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:38 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gDLNw-0003HR-Qa for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:37 -0400 Received: from mx1.redhat.com ([209.132.183.28]:38086) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gDLNw-00037y-8m; Thu, 18 Oct 2018 23:23:36 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 3A30688304; Fri, 19 Oct 2018 03:23:29 +0000 (UTC) Received: from jason-ThinkPad-T430s.redhat.com (ovpn-12-114.pek2.redhat.com [10.72.12.114]) by smtp.corp.redhat.com (Postfix) with ESMTP id 804955C69A; Fri, 19 Oct 2018 03:23:27 +0000 (UTC) From: Jason Wang To: peter.maydell@linaro.org, qemu-devel@nongnu.org Date: Fri, 19 Oct 2018 11:22:20 +0800 Message-Id: <1539919345-10703-22-git-send-email-jasowang@redhat.com> In-Reply-To: <1539919345-10703-1-git-send-email-jasowang@redhat.com> References: <1539919345-10703-1-git-send-email-jasowang@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.28]); Fri, 19 Oct 2018 03:23:29 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PULL V2 21/26] ne2000: fix possible out of bound access in ne2000_receive X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jason Wang , qemu-stable@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RDMRC_1 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" In ne2000_receive(), we try to assign size_ to size which converts from size_t to integer. This will cause troubles when size_ is greater INT_MAX, this will lead a negative value in size and it can then pass the check of size < MIN_BUF_SIZE which may lead out of bound access of for both buf and buf1. Fixing by converting the type of size to size_t. CC: qemu-stable@nongnu.org Reported-by: Daniel Shapira Reviewed-by: Michael S. Tsirkin Signed-off-by: Jason Wang --- hw/net/ne2000.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/hw/net/ne2000.c b/hw/net/ne2000.c index 07d79e3..869518e 100644 --- a/hw/net/ne2000.c +++ b/hw/net/ne2000.c @@ -174,7 +174,7 @@ static int ne2000_buffer_full(NE2000State *s) ssize_t ne2000_receive(NetClientState *nc, const uint8_t *buf, size_t size= _) { NE2000State *s =3D qemu_get_nic_opaque(nc); - int size =3D size_; + size_t size =3D size_; uint8_t *p; unsigned int total_len, next, avail, len, index, mcast_idx; uint8_t buf1[60]; @@ -182,7 +182,7 @@ ssize_t ne2000_receive(NetClientState *nc, const uint8_= t *buf, size_t size_) { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff }; =20 #if defined(DEBUG_NE2000) - printf("NE2000: received len=3D%d\n", size); + printf("NE2000: received len=3D%zu\n", size); #endif =20 if (s->cmd & E8390_STOP || ne2000_buffer_full(s)) --=20 2.5.0 From nobody Thu May 16 07:55:05 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1539920660512428.0052737527751; Thu, 18 Oct 2018 20:44:20 -0700 (PDT) Received: from localhost ([::1]:46556 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLhz-00043Z-FL for importer@patchew.org; Thu, 18 Oct 2018 23:44:19 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:36924) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLNy-00033m-Az for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:39 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gDLNx-0003IK-49 for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:38 -0400 Received: from mx1.redhat.com ([209.132.183.28]:37954) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gDLNw-00039B-KQ; Thu, 18 Oct 2018 23:23:36 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 7B376CA38E; Fri, 19 Oct 2018 03:23:31 +0000 (UTC) Received: from jason-ThinkPad-T430s.redhat.com (ovpn-12-114.pek2.redhat.com [10.72.12.114]) by smtp.corp.redhat.com (Postfix) with ESMTP id BBE065C69A; Fri, 19 Oct 2018 03:23:29 +0000 (UTC) From: Jason Wang To: peter.maydell@linaro.org, qemu-devel@nongnu.org Date: Fri, 19 Oct 2018 11:22:21 +0800 Message-Id: <1539919345-10703-23-git-send-email-jasowang@redhat.com> In-Reply-To: <1539919345-10703-1-git-send-email-jasowang@redhat.com> References: <1539919345-10703-1-git-send-email-jasowang@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.38]); Fri, 19 Oct 2018 03:23:31 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PULL V2 22/26] rtl8139: fix possible out of bound access X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jason Wang , qemu-stable@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RDMRC_1 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" In rtl8139_do_receive(), we try to assign size_ to size which converts from size_t to integer. This will cause troubles when size_ is greater INT_MAX, this will lead a negative value in size and it can then pass the check of size < MIN_BUF_SIZE which may lead out of bound access of for both buf and buf1. Fixing by converting the type of size to size_t. CC: qemu-stable@nongnu.org Reported-by: Daniel Shapira Reviewed-by: Michael S. Tsirkin Signed-off-by: Jason Wang --- hw/net/rtl8139.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/hw/net/rtl8139.c b/hw/net/rtl8139.c index 46daa16..2342a09 100644 --- a/hw/net/rtl8139.c +++ b/hw/net/rtl8139.c @@ -817,7 +817,7 @@ static ssize_t rtl8139_do_receive(NetClientState *nc, c= onst uint8_t *buf, size_t RTL8139State *s =3D qemu_get_nic_opaque(nc); PCIDevice *d =3D PCI_DEVICE(s); /* size is the length of the buffer passed to the driver */ - int size =3D size_; + size_t size =3D size_; const uint8_t *dot1q_buf =3D NULL; =20 uint32_t packet_header =3D 0; @@ -826,7 +826,7 @@ static ssize_t rtl8139_do_receive(NetClientState *nc, c= onst uint8_t *buf, size_t static const uint8_t broadcast_macaddr[6] =3D { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff }; =20 - DPRINTF(">>> received len=3D%d\n", size); + DPRINTF(">>> received len=3D%zu\n", size); =20 /* test if board clock is stopped */ if (!s->clock_enabled) @@ -1035,7 +1035,7 @@ static ssize_t rtl8139_do_receive(NetClientState *nc,= const uint8_t *buf, size_t =20 if (size+4 > rx_space) { - DPRINTF("C+ Rx mode : descriptor %d size %d received %d + 4\n", + DPRINTF("C+ Rx mode : descriptor %d size %d received %zu + 4\n= ", descriptor, rx_space, size); =20 s->IntrStatus |=3D RxOverflow; @@ -1148,7 +1148,7 @@ static ssize_t rtl8139_do_receive(NetClientState *nc,= const uint8_t *buf, size_t if (avail !=3D 0 && RX_ALIGN(size + 8) >=3D avail) { DPRINTF("rx overflow: rx buffer length %d head 0x%04x " - "read 0x%04x =3D=3D=3D available 0x%04x need 0x%04x\n", + "read 0x%04x =3D=3D=3D available 0x%04x need 0x%04zx\n", s->RxBufferSize, s->RxBufAddr, s->RxBufPtr, avail, size + = 8); =20 s->IntrStatus |=3D RxOverflow; --=20 2.5.0 From nobody Thu May 16 07:55:05 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1539920570382208.84037073751142; Thu, 18 Oct 2018 20:42:50 -0700 (PDT) Received: from localhost ([::1]:46551 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLgP-0001zV-M8 for importer@patchew.org; Thu, 18 Oct 2018 23:42:41 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:36940) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLNy-00034P-R5 for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:39 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gDLNx-0003JX-Kg for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:38 -0400 Received: from mx1.redhat.com ([209.132.183.28]:40760) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gDLNx-0003Cc-8J; Thu, 18 Oct 2018 23:23:37 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id AE69E3078A34; Fri, 19 Oct 2018 03:23:33 +0000 (UTC) Received: from jason-ThinkPad-T430s.redhat.com (ovpn-12-114.pek2.redhat.com [10.72.12.114]) by smtp.corp.redhat.com (Postfix) with ESMTP id F3A0E3793; Fri, 19 Oct 2018 03:23:31 +0000 (UTC) From: Jason Wang To: peter.maydell@linaro.org, qemu-devel@nongnu.org Date: Fri, 19 Oct 2018 11:22:22 +0800 Message-Id: <1539919345-10703-24-git-send-email-jasowang@redhat.com> In-Reply-To: <1539919345-10703-1-git-send-email-jasowang@redhat.com> References: <1539919345-10703-1-git-send-email-jasowang@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.48]); Fri, 19 Oct 2018 03:23:33 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PULL V2 23/26] pcnet: fix possible buffer overflow X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jason Wang , qemu-stable@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RDMRC_1 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" In pcnet_receive(), we try to assign size_ to size which converts from size_t to integer. This will cause troubles when size_ is greater INT_MAX, this will lead a negative value in size and it can then pass the check of size < MIN_BUF_SIZE which may lead out of bound access for both buf and buf1. Fixing by converting the type of size to size_t. CC: qemu-stable@nongnu.org Reported-by: Daniel Shapira Reviewed-by: Michael S. Tsirkin Signed-off-by: Jason Wang --- hw/net/pcnet.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/hw/net/pcnet.c b/hw/net/pcnet.c index 0c44554..d9ba04b 100644 --- a/hw/net/pcnet.c +++ b/hw/net/pcnet.c @@ -988,14 +988,14 @@ ssize_t pcnet_receive(NetClientState *nc, const uint8= _t *buf, size_t size_) uint8_t buf1[60]; int remaining; int crc_err =3D 0; - int size =3D size_; + size_t size =3D size_; =20 if (CSR_DRX(s) || CSR_STOP(s) || CSR_SPND(s) || !size || (CSR_LOOP(s) && !s->looptest)) { return -1; } #ifdef PCNET_DEBUG - printf("pcnet_receive size=3D%d\n", size); + printf("pcnet_receive size=3D%zu\n", size); #endif =20 /* if too small buffer, then expand it */ --=20 2.5.0 From nobody Thu May 16 07:55:05 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1539920663498954.5804831271497; Thu, 18 Oct 2018 20:44:23 -0700 (PDT) Received: from localhost ([::1]:46557 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLi2-00045V-F4 for importer@patchew.org; Thu, 18 Oct 2018 23:44:22 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:36947) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLNz-00034e-16 for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:40 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gDLNx-0003K1-Qh for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:38 -0400 Received: from mx1.redhat.com ([209.132.183.28]:59994) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gDLNx-0003FB-Bj; Thu, 18 Oct 2018 23:23:37 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id E63991152D; Fri, 19 Oct 2018 03:23:35 +0000 (UTC) Received: from jason-ThinkPad-T430s.redhat.com (ovpn-12-114.pek2.redhat.com [10.72.12.114]) by smtp.corp.redhat.com (Postfix) with ESMTP id 38B395C69A; Fri, 19 Oct 2018 03:23:33 +0000 (UTC) From: Jason Wang To: peter.maydell@linaro.org, qemu-devel@nongnu.org Date: Fri, 19 Oct 2018 11:22:23 +0800 Message-Id: <1539919345-10703-25-git-send-email-jasowang@redhat.com> In-Reply-To: <1539919345-10703-1-git-send-email-jasowang@redhat.com> References: <1539919345-10703-1-git-send-email-jasowang@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.26]); Fri, 19 Oct 2018 03:23:35 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PULL V2 24/26] net: ignore packet size greater than INT_MAX X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jason Wang , qemu-stable@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RDMRC_1 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" There should not be a reason for passing a packet size greater than INT_MAX. It's usually a hint of bug somewhere, so ignore packet size greater than INT_MAX in qemu_deliver_packet_iov() CC: qemu-stable@nongnu.org Reported-by: Daniel Shapira Reviewed-by: Michael S. Tsirkin Signed-off-by: Jason Wang --- net/net.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/net/net.c b/net/net.c index c66847e..07c194a 100644 --- a/net/net.c +++ b/net/net.c @@ -712,10 +712,15 @@ ssize_t qemu_deliver_packet_iov(NetClientState *sende= r, void *opaque) { NetClientState *nc =3D opaque; + size_t size =3D iov_size(iov, iovcnt); int ret; =20 + if (size > INT_MAX) { + return size; + } + if (nc->link_down) { - return iov_size(iov, iovcnt); + return size; } =20 if (nc->receive_disabled) { --=20 2.5.0 From nobody Thu May 16 07:55:05 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1539920765525570.2806392357986; Thu, 18 Oct 2018 20:46:05 -0700 (PDT) Received: from localhost ([::1]:46575 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLjg-0005ND-A5 for importer@patchew.org; Thu, 18 Oct 2018 23:46:04 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:37043) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLO0-00036P-Uk for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:44 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gDLNz-0003Nf-Fe for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:40 -0400 Received: from mx1.redhat.com ([209.132.183.28]:59750) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gDLNz-0003L8-1I for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:39 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 2D52230034A1; Fri, 19 Oct 2018 03:23:38 +0000 (UTC) Received: from jason-ThinkPad-T430s.redhat.com (ovpn-12-114.pek2.redhat.com [10.72.12.114]) by smtp.corp.redhat.com (Postfix) with ESMTP id 717BD171B2; Fri, 19 Oct 2018 03:23:36 +0000 (UTC) From: Jason Wang To: peter.maydell@linaro.org, qemu-devel@nongnu.org Date: Fri, 19 Oct 2018 11:22:24 +0800 Message-Id: <1539919345-10703-26-git-send-email-jasowang@redhat.com> In-Reply-To: <1539919345-10703-1-git-send-email-jasowang@redhat.com> References: <1539919345-10703-1-git-send-email-jasowang@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.47]); Fri, 19 Oct 2018 03:23:38 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PULL V2 25/26] e1000: indicate dropped packets in HW counters X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jason Wang , Martin Wilck Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RDMRC_1 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" The e1000 emulation silently discards RX packets if there's insufficient space in the ring buffer. This leads to errors on higher-level protocols in the guest, with no indication about the error cause. This patch increments the "Missed Packets Count" (MPC) and "Receive No Buffers Count" (RNBC) HW counters in this case. As the emulation has no FIFO for buffering packets that can't immediately be pushed to the guest, these two registers are practically equivalent (see 10.2.7.4, 10.2.7.33 in https://www.intel.com/content/www/us/en/embedded/products/networking/82574l= -gbe-controller-datasheet.html). On a Linux guest, the register content will be reflected in the "rx_missed_errors" and "rx_no_buffer_count" stats from "ethtool -S", and in the "missed" stat from "ip -s -s link show", giving at least some hint about the error cause inside the guest. If the cause is known, problems like this can often be avoided easily, by increasing the number of RX descriptors in the guest e1000 driver (e.g under Linux, "e1000.RxDescriptors=3D1024"). The patch also adds a qemu trace message for this condition. Signed-off-by: Martin Wilck Signed-off-by: Jason Wang --- hw/net/e1000.c | 16 +++++++++++++--- hw/net/trace-events | 3 +++ 2 files changed, 16 insertions(+), 3 deletions(-) diff --git a/hw/net/e1000.c b/hw/net/e1000.c index 13a9494..5e144cb 100644 --- a/hw/net/e1000.c +++ b/hw/net/e1000.c @@ -36,6 +36,7 @@ #include "qemu/range.h" =20 #include "e1000x_common.h" +#include "trace.h" =20 static const uint8_t bcast[] =3D {0xff, 0xff, 0xff, 0xff, 0xff, 0xff}; =20 @@ -847,6 +848,15 @@ static uint64_t rx_desc_base(E1000State *s) return (bah << 32) + bal; } =20 +static void +e1000_receiver_overrun(E1000State *s, size_t size) +{ + trace_e1000_receiver_overrun(size, s->mac_reg[RDH], s->mac_reg[RDT]); + e1000x_inc_reg_if_not_full(s->mac_reg, RNBC); + e1000x_inc_reg_if_not_full(s->mac_reg, MPC); + set_ics(s, 0, E1000_ICS_RXO); +} + static ssize_t e1000_receive_iov(NetClientState *nc, const struct iovec *iov, int iovcnt) { @@ -916,8 +926,8 @@ e1000_receive_iov(NetClientState *nc, const struct iove= c *iov, int iovcnt) desc_offset =3D 0; total_size =3D size + e1000x_fcs_len(s->mac_reg); if (!e1000_has_rxbufs(s, total_size)) { - set_ics(s, 0, E1000_ICS_RXO); - return -1; + e1000_receiver_overrun(s, total_size); + return -1; } do { desc_size =3D total_size - desc_offset; @@ -969,7 +979,7 @@ e1000_receive_iov(NetClientState *nc, const struct iove= c *iov, int iovcnt) rdh_start >=3D s->mac_reg[RDLEN] / sizeof(desc)) { DBGOUT(RXERR, "RDH wraparound @%x, RDT %x, RDLEN %x\n", rdh_start, s->mac_reg[RDT], s->mac_reg[RDLEN]); - set_ics(s, 0, E1000_ICS_RXO); + e1000_receiver_overrun(s, total_size); return -1; } } while (desc_offset < total_size); diff --git a/hw/net/trace-events b/hw/net/trace-events index c1dea4b..9d49f62 100644 --- a/hw/net/trace-events +++ b/hw/net/trace-events @@ -98,6 +98,9 @@ net_rx_pkt_rss_ip6_ex(void) "Calculating IPv6/EX RSS has= h" net_rx_pkt_rss_hash(size_t rss_length, uint32_t rss_hash) "RSS hash for %z= u bytes: 0x%X" net_rx_pkt_rss_add_chunk(void* ptr, size_t size, size_t input_offset) "Add= RSS chunk %p, %zu bytes, RSS input offset %zu bytes" =20 +# hw/net/e1000.c +e1000_receiver_overrun(size_t s, uint32_t rdh, uint32_t rdt) "Receiver ove= rrun: dropped packet of %zu bytes, RDH=3D%u, RDT=3D%u" + # hw/net/e1000x_common.c e1000x_rx_can_recv_disabled(bool link_up, bool rx_enabled, bool pci_master= ) "link_up: %d, rx_enabled %d, pci_master %d" e1000x_vlan_is_vlan_pkt(bool is_vlan_pkt, uint16_t eth_proto, uint16_t vet= ) "Is VLAN packet: %d, ETH proto: 0x%X, VET: 0x%X" --=20 2.5.0 From nobody Thu May 16 07:55:05 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1539920058089945.291155241402; Thu, 18 Oct 2018 20:34:18 -0700 (PDT) Received: from localhost ([::1]:46504 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLYH-0003nz-0e for importer@patchew.org; Thu, 18 Oct 2018 23:34:17 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:37117) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gDLO4-00038w-Ed for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:45 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gDLO1-0003Q5-F8 for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:44 -0400 Received: from mx1.redhat.com ([209.132.183.28]:38118) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gDLO1-0003PH-37 for qemu-devel@nongnu.org; Thu, 18 Oct 2018 23:23:41 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 6A47788304; Fri, 19 Oct 2018 03:23:40 +0000 (UTC) Received: from jason-ThinkPad-T430s.redhat.com (ovpn-12-114.pek2.redhat.com [10.72.12.114]) by smtp.corp.redhat.com (Postfix) with ESMTP id AD7AB5C69A; Fri, 19 Oct 2018 03:23:38 +0000 (UTC) From: Jason Wang To: peter.maydell@linaro.org, qemu-devel@nongnu.org Date: Fri, 19 Oct 2018 11:22:25 +0800 Message-Id: <1539919345-10703-27-git-send-email-jasowang@redhat.com> In-Reply-To: <1539919345-10703-1-git-send-email-jasowang@redhat.com> References: <1539919345-10703-1-git-send-email-jasowang@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.28]); Fri, 19 Oct 2018 03:23:40 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PULL V2 26/26] qemu-options: Fix bad "macaddr" property in the documentation X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Thomas Huth , Jason Wang Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RDMRC_1 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" From: Thomas Huth When using the "-device" option, the property is called "mac". "macaddr" is only used for the legacy "-net nic" option. Reported-by: Harald Hoyer Reviewed-by: Markus Armbruster Signed-off-by: Thomas Huth Signed-off-by: Jason Wang --- qemu-options.hx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/qemu-options.hx b/qemu-options.hx index f139459..0f2f65e 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -2256,7 +2256,7 @@ qemu-system-i386 linux.img \ -netdev socket,id=3Dn2,mcast=3D230.0.0.1:1234 # launch yet another QEMU instance on same "bus" qemu-system-i386 linux.img \ - -device e1000,netdev=3Dn3,macaddr=3D52:54:00:12:34:58 \ + -device e1000,netdev=3Dn3,mac=3D52:54:00:12:34:58 \ -netdev socket,id=3Dn3,mcast=3D230.0.0.1:1234 @end example =20 --=20 2.5.0