From nobody Thu Nov 13 16:32:42 2025
From: zhanghailiang
Subject: [PATCH V2 1/8] migration: fix COLO broken caused by a previous commit
Date: Mon, 24 Feb 2020 14:54:07 +0800
Message-ID: <20200224065414.36524-2-zhang.zhanghailiang@huawei.com>
In-Reply-To: <20200224065414.36524-1-zhang.zhanghailiang@huawei.com>
References: <20200224065414.36524-1-zhang.zhanghailiang@huawei.com>
Cc: danielcho@qnap.com, zhanghailiang, dgilbert@redhat.com, quintela@redhat.com

The commit "migration: Create migration_is_running()" broke COLO: the
following call path no longer works.

colo_process_checkpoint
 ->colo_do_checkpoint_transaction
   ->migrate_set_block_enabled
     ->qmp_migrate_set_capabilities

It can be fixed by making the COLO process an exception. Maybe we need
a better way to fix it.

Cc: Juan Quintela
Signed-off-by: zhanghailiang
Reviewed-by: Juan Quintela
---
 migration/migration.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/migration/migration.c b/migration/migration.c
index 8fb68795dc..06d1ff9d56 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -865,7 +865,6 @@ bool migration_is_running(int state)
     case MIGRATION_STATUS_DEVICE:
     case MIGRATION_STATUS_WAIT_UNPLUG:
     case MIGRATION_STATUS_CANCELLING:
-    case MIGRATION_STATUS_COLO:
         return true;
 
     default:
-- 
2.21.0

From nobody Thu Nov 13 16:32:42 2025
From: zhanghailiang
Subject: [PATCH V2 2/8] migration/colo: wrap incoming checkpoint process into new helper
Date: Mon, 24 Feb 2020 14:54:08 +0800
Message-ID: <20200224065414.36524-3-zhang.zhanghailiang@huawei.com>
In-Reply-To: <20200224065414.36524-1-zhang.zhanghailiang@huawei.com>
References: <20200224065414.36524-1-zhang.zhanghailiang@huawei.com>
Cc: danielcho@qnap.com, zhanghailiang, dgilbert@redhat.com, quintela@redhat.com

Split the incoming checkpoint process into a helper.

Signed-off-by: zhanghailiang
Reviewed-by: Dr. David Alan Gilbert
---
 migration/colo.c | 260 ++++++++++++++++++++++++-----------------------
 1 file changed, 133 insertions(+), 127 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index 2c88aa57a2..93c5a452fb 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -664,13 +664,138 @@ void migrate_start_colo_process(MigrationState *s)
     qemu_mutex_lock_iothread();
 }
 
-static void colo_wait_handle_message(QEMUFile *f, int *checkpoint_request,
-                                     Error **errp)
+static void colo_incoming_process_checkpoint(MigrationIncomingState *mis,
+                      QEMUFile *fb, QIOChannelBuffer *bioc, Error **errp)
+{
+    uint64_t total_size;
+    uint64_t value;
+    Error *local_err = NULL;
+    int ret;
+
+    qemu_mutex_lock_iothread();
+    vm_stop_force_state(RUN_STATE_COLO);
+    trace_colo_vm_state_change("run", "stop");
+    qemu_mutex_unlock_iothread();
+
+    /* FIXME: This is unnecessary for periodic checkpoint mode */
+    colo_send_message(mis->to_src_file, COLO_MESSAGE_CHECKPOINT_REPLY,
+                 &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return;
+    }
+
+    colo_receive_check_message(mis->from_src_file,
+                       COLO_MESSAGE_VMSTATE_SEND, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return;
+    }
+
+    qemu_mutex_lock_iothread();
+    cpu_synchronize_all_pre_loadvm();
+    ret = qemu_loadvm_state_main(mis->from_src_file, mis);
+    qemu_mutex_unlock_iothread();
+
+    if (ret < 0) {
+        error_setg(errp, "Load VM's live state (ram) error");
+        return;
+    }
+
+    value = colo_receive_message_value(mis->from_src_file,
+                             COLO_MESSAGE_VMSTATE_SIZE, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return;
+    }
+
+    /*
+     * Read VM device state data into channel buffer,
+     * It's better to re-use the memory allocated.
+     * Here we need to handle the channel buffer directly.
+     */
+    if (value > bioc->capacity) {
+        bioc->capacity = value;
+        bioc->data = g_realloc(bioc->data, bioc->capacity);
+    }
+    total_size = qemu_get_buffer(mis->from_src_file, bioc->data, value);
+    if (total_size != value) {
+        error_setg(errp, "Got %" PRIu64 " VMState data, less than expected"
+                    " %" PRIu64, total_size, value);
+        return;
+    }
+    bioc->usage = total_size;
+    qio_channel_io_seek(QIO_CHANNEL(bioc), 0, 0, NULL);
+
+    colo_send_message(mis->to_src_file, COLO_MESSAGE_VMSTATE_RECEIVED,
+                 &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return;
+    }
+
+    qemu_mutex_lock_iothread();
+    vmstate_loading = true;
+    ret = qemu_load_device_state(fb);
+    if (ret < 0) {
+        error_setg(errp, "COLO: load device state failed");
+        qemu_mutex_unlock_iothread();
+        return;
+    }
+
+#ifdef CONFIG_REPLICATION
+    replication_get_error_all(&local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        qemu_mutex_unlock_iothread();
+        return;
+    }
+
+    /* discard colo disk buffer */
+    replication_do_checkpoint_all(&local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        qemu_mutex_unlock_iothread();
+        return;
+    }
+#else
+    abort();
+#endif
+    /* Notify all filters of all NIC to do checkpoint */
+    colo_notify_filters_event(COLO_EVENT_CHECKPOINT, &local_err);
+
+    if (local_err) {
+        error_propagate(errp, local_err);
+        qemu_mutex_unlock_iothread();
+        return;
+    }
+
+    vmstate_loading = false;
+    vm_start();
+    trace_colo_vm_state_change("stop", "run");
+    qemu_mutex_unlock_iothread();
+
+    if (failover_get_state() == FAILOVER_STATUS_RELAUNCH) {
+        failover_set_state(FAILOVER_STATUS_RELAUNCH,
+                        FAILOVER_STATUS_NONE);
+        failover_request_active(NULL);
+        return;
+    }
+
+    colo_send_message(mis->to_src_file, COLO_MESSAGE_VMSTATE_LOADED,
+                 &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+    }
+}
+
+static void colo_wait_handle_message(MigrationIncomingState *mis,
+                QEMUFile *fb, QIOChannelBuffer *bioc, Error **errp)
 {
     COLOMessage msg;
     Error *local_err = NULL;
 
-    msg = colo_receive_message(f, &local_err);
+    msg = colo_receive_message(mis->from_src_file, &local_err);
     if (local_err) {
         error_propagate(errp, local_err);
         return;
@@ -678,10 +803,9 @@ static void colo_wait_handle_message(QEMUFile *f, int *checkpoint_request,
 
     switch (msg) {
     case COLO_MESSAGE_CHECKPOINT_REQUEST:
-        *checkpoint_request = 1;
+        colo_incoming_process_checkpoint(mis, fb, bioc, errp);
         break;
     default:
-        *checkpoint_request = 0;
         error_setg(errp, "Got unknown COLO message: %d", msg);
         break;
     }
@@ -692,10 +816,7 @@ void *colo_process_incoming_thread(void *opaque)
     MigrationIncomingState *mis = opaque;
     QEMUFile *fb = NULL;
     QIOChannelBuffer *bioc = NULL; /* Cache incoming device state */
-    uint64_t total_size;
-    uint64_t value;
     Error *local_err = NULL;
-    int ret;
 
     rcu_register_thread();
     qemu_sem_init(&mis->colo_incoming_sem, 0);
@@ -749,134 +870,19 @@ void *colo_process_incoming_thread(void *opaque)
     }
 
     while (mis->state == MIGRATION_STATUS_COLO) {
-        int request = 0;
-
-        colo_wait_handle_message(mis->from_src_file, &request, &local_err);
+        colo_wait_handle_message(mis, fb, bioc, &local_err);
         if (local_err) {
-            goto out;
+            error_report_err(local_err);
+            break;
         }
-        assert(request);
         if (failover_get_state() != FAILOVER_STATUS_NONE) {
             error_report("failover request");
-            goto out;
-        }
-
-        qemu_mutex_lock_iothread();
-        vm_stop_force_state(RUN_STATE_COLO);
-        trace_colo_vm_state_change("run", "stop");
-        qemu_mutex_unlock_iothread();
-
-        /* FIXME: This is unnecessary for periodic checkpoint mode */
-        colo_send_message(mis->to_src_file, COLO_MESSAGE_CHECKPOINT_REPLY,
-                     &local_err);
-        if (local_err) {
-            goto out;
-        }
-
-        colo_receive_check_message(mis->from_src_file,
-                           COLO_MESSAGE_VMSTATE_SEND, &local_err);
-        if (local_err) {
-            goto out;
-        }
-
-        qemu_mutex_lock_iothread();
-        cpu_synchronize_all_pre_loadvm();
-        ret = qemu_loadvm_state_main(mis->from_src_file, mis);
-        qemu_mutex_unlock_iothread();
-
-        if (ret < 0) {
-            error_report("Load VM's live state (ram) error");
-            goto out;
-        }
-
-        value = colo_receive_message_value(mis->from_src_file,
-                                 COLO_MESSAGE_VMSTATE_SIZE, &local_err);
-        if (local_err) {
-            goto out;
-        }
-
-        /*
-         * Read VM device state data into channel buffer,
-         * It's better to re-use the memory allocated.
-         * Here we need to handle the channel buffer directly.
-         */
-        if (value > bioc->capacity) {
-            bioc->capacity = value;
-            bioc->data = g_realloc(bioc->data, bioc->capacity);
-        }
-        total_size = qemu_get_buffer(mis->from_src_file, bioc->data, value);
-        if (total_size != value) {
-            error_report("Got %" PRIu64 " VMState data, less than expected"
-                        " %" PRIu64, total_size, value);
-            goto out;
-        }
-        bioc->usage = total_size;
-        qio_channel_io_seek(QIO_CHANNEL(bioc), 0, 0, NULL);
-
-        colo_send_message(mis->to_src_file, COLO_MESSAGE_VMSTATE_RECEIVED,
-                     &local_err);
-        if (local_err) {
-            goto out;
-        }
-
-        qemu_mutex_lock_iothread();
-        vmstate_loading = true;
-        ret = qemu_load_device_state(fb);
-        if (ret < 0) {
-            error_report("COLO: load device state failed");
-            qemu_mutex_unlock_iothread();
-            goto out;
-        }
-
-#ifdef CONFIG_REPLICATION
-        replication_get_error_all(&local_err);
-        if (local_err) {
-            qemu_mutex_unlock_iothread();
-            goto out;
-        }
-
-        /* discard colo disk buffer */
-        replication_do_checkpoint_all(&local_err);
-        if (local_err) {
-            qemu_mutex_unlock_iothread();
-            goto out;
-        }
-#else
-        abort();
-#endif
-        /* Notify all filters of all NIC to do checkpoint */
-        colo_notify_filters_event(COLO_EVENT_CHECKPOINT, &local_err);
-
-        if (local_err) {
-            qemu_mutex_unlock_iothread();
-            goto out;
-        }
-
-        vmstate_loading = false;
-        vm_start();
-        trace_colo_vm_state_change("stop", "run");
-        qemu_mutex_unlock_iothread();
-
-        if (failover_get_state() == FAILOVER_STATUS_RELAUNCH) {
-            failover_set_state(FAILOVER_STATUS_RELAUNCH,
-                            FAILOVER_STATUS_NONE);
-            failover_request_active(NULL);
-            goto out;
-        }
-
-        colo_send_message(mis->to_src_file, COLO_MESSAGE_VMSTATE_LOADED,
-                     &local_err);
-        if (local_err) {
-            goto out;
+            break;
         }
     }
 
 out:
     vmstate_loading = false;
-    /* Throw the unreported error message after exited from loop */
-    if (local_err) {
-        error_report_err(local_err);
-    }
 
     /*
      * There are only two reasons we can get here, some error happened
-- 
2.21.0

From nobody Thu Nov 13 16:32:42 2025
From: zhanghailiang
Subject: [PATCH V2 3/8] savevm: Don't call colo_init_ram_cache twice
Date: Mon, 24 Feb 2020 14:54:09 +0800
Message-ID: <20200224065414.36524-4-zhang.zhanghailiang@huawei.com>
In-Reply-To: <20200224065414.36524-1-zhang.zhanghailiang@huawei.com>
References: <20200224065414.36524-1-zhang.zhanghailiang@huawei.com>
Cc: danielcho@qnap.com, zhanghailiang, dgilbert@redhat.com, quintela@redhat.com

This helper is called twice, which is wrong. Keep only the call made
when the COLO enable message is received from the source side.

Signed-off-by: zhanghailiang
Reviewed-by: Juan Quintela
---
 migration/migration.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 06d1ff9d56..e8c62c6e2e 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -484,11 +484,6 @@ static void process_incoming_migration_co(void *opaque)
             goto fail;
         }
 
-        if (colo_init_ram_cache() < 0) {
-            error_report("Init ram cache failed");
-            goto fail;
-        }
-
         qemu_thread_create(&mis->colo_incoming_thread, "COLO incoming",
              colo_process_incoming_thread, mis, QEMU_THREAD_JOINABLE);
         mis->have_colo_incoming_thread = true;
-- 
2.21.0

From nobody Thu Nov 13 16:32:42 2025
From: zhanghailiang
Subject: [PATCH V2 4/8] COLO: Optimize memory back-up process
Date: Mon, 24 Feb 2020 14:54:10 +0800
Message-ID: <20200224065414.36524-5-zhang.zhanghailiang@huawei.com>
In-Reply-To: <20200224065414.36524-1-zhang.zhanghailiang@huawei.com>
References: <20200224065414.36524-1-zhang.zhanghailiang@huawei.com>
Cc: danielcho@qnap.com, zhanghailiang, dgilbert@redhat.com, quintela@redhat.com

This patch reduces VM downtime during the initial COLO process.
Previously, we copied all of the memory in the COLO preparing stage,
while the VM had to be stopped, which is a time-consuming process.
Here we optimize it with a trick: back up every page during the
migration process while COLO is enabled. Although this affects the
speed of the migration, it clearly reduces the downtime needed to back
up all of the SVM's memory in the COLO preparing stage.

Signed-off-by: zhanghailiang
Reviewed-by: Dr. David Alan Gilbert
---
 migration/colo.c |  3 +++
 migration/ram.c  | 68 +++++++++++++++++++++++++++++++++++-------------
 migration/ram.h  |  1 +
 3 files changed, 54 insertions(+), 18 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index 93c5a452fb..44942c4e23 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -26,6 +26,7 @@
 #include "qemu/main-loop.h"
 #include "qemu/rcu.h"
 #include "migration/failover.h"
+#include "migration/ram.h"
 #ifdef CONFIG_REPLICATION
 #include "replication.h"
 #endif
@@ -845,6 +846,8 @@ void *colo_process_incoming_thread(void *opaque)
      */
     qemu_file_set_blocking(mis->from_src_file, true);
 
+    colo_incoming_start_dirty_log();
+
     bioc = qio_channel_buffer_new(COLO_BUFFER_BASE_SIZE);
     fb = qemu_fopen_channel_input(QIO_CHANNEL(bioc));
     object_unref(OBJECT(bioc));
diff --git a/migration/ram.c b/migration/ram.c
index ed23ed1c7c..ebf9e6ba51 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2277,6 +2277,7 @@ static void ram_list_init_bitmaps(void)
              * dirty_memory[DIRTY_MEMORY_MIGRATION] don't include the whole
              * guest memory.
              */
+
             block->bmap = bitmap_new(pages);
             bitmap_set(block->bmap, 0, pages);
             block->clear_bmap_shift = shift;
@@ -2986,7 +2987,6 @@ int colo_init_ram_cache(void)
                 }
                 return -errno;
             }
-            memcpy(block->colo_cache, block->host, block->used_length);
         }
     }
 
@@ -3000,19 +3000,36 @@ int colo_init_ram_cache(void)
 
         RAMBLOCK_FOREACH_NOT_IGNORED(block) {
             unsigned long pages = block->max_length >> TARGET_PAGE_BITS;
-
             block->bmap = bitmap_new(pages);
-            bitmap_set(block->bmap, 0, pages);
         }
     }
-    ram_state = g_new0(RAMState, 1);
-    ram_state->migration_dirty_pages = 0;
-    qemu_mutex_init(&ram_state->bitmap_mutex);
-    memory_global_dirty_log_start();
 
+    ram_state_init(&ram_state);
     return 0;
 }
 
+/* TODO: duplicated with ram_init_bitmaps */
+void colo_incoming_start_dirty_log(void)
+{
+    RAMBlock *block = NULL;
+    /* For memory_global_dirty_log_start below. */
+    qemu_mutex_lock_iothread();
+    qemu_mutex_lock_ramlist();
+
+    memory_global_dirty_log_sync();
+    WITH_RCU_READ_LOCK_GUARD() {
+        RAMBLOCK_FOREACH_NOT_IGNORED(block) {
+            ramblock_sync_dirty_bitmap(ram_state, block);
+            /* Discard this dirty bitmap record */
+            bitmap_zero(block->bmap, block->max_length >> TARGET_PAGE_BITS);
+        }
+        memory_global_dirty_log_start();
+    }
+    ram_state->migration_dirty_pages = 0;
+    qemu_mutex_unlock_ramlist();
+    qemu_mutex_unlock_iothread();
+}
+
 /* It is need to hold the global lock to call this helper */
 void colo_release_ram_cache(void)
 {
@@ -3032,9 +3049,7 @@ void colo_release_ram_cache(void)
             }
         }
     }
-    qemu_mutex_destroy(&ram_state->bitmap_mutex);
-    g_free(ram_state);
-    ram_state = NULL;
+    ram_state_cleanup(&ram_state);
 }
 
 /**
@@ -3302,7 +3317,6 @@ static void colo_flush_ram_cache(void)
             ramblock_sync_dirty_bitmap(ram_state, block);
         }
     }
-
     trace_colo_flush_ram_cache_begin(ram_state->migration_dirty_pages);
     WITH_RCU_READ_LOCK_GUARD() {
         block = QLIST_FIRST_RCU(&ram_list.blocks);
@@ -3348,7 +3362,7 @@ static int ram_load_precopy(QEMUFile *f)
 
     while (!ret && !(flags & RAM_SAVE_FLAG_EOS)) {
         ram_addr_t addr, total_ram_bytes;
-        void *host = NULL;
+        void *host = NULL, *host_bak = NULL;
         uint8_t ch;
 
         /*
@@ -3379,20 +3393,35 @@ static int ram_load_precopy(QEMUFile *f)
                      RAM_SAVE_FLAG_COMPRESS_PAGE | RAM_SAVE_FLAG_XBZRLE)) {
             RAMBlock *block = ram_block_from_stream(f, flags);
 
+            host = host_from_ram_block_offset(block, addr);
             /*
-             * After going into COLO, we should load the Page into colo_cache.
+             * After going into COLO stage, we should not load the page
+             * into SVM's memory directly, we put them into colo_cache firstly.
+             * NOTE: We need to keep a copy of SVM's ram in colo_cache.
+             * Previously, we copied all these memory in preparing stage of COLO
+             * while we need to stop VM, which is a time-consuming process.
+             * Here we optimize it by a trick, back-up every page while in
+             * migration process while COLO is enabled, though it affects the
+             * speed of the migration, but it obviously reduces the downtime of
+             * back-up all SVM's memory in COLO preparing stage.
              */
-            if (migration_incoming_in_colo_state()) {
-                host = colo_cache_from_block_offset(block, addr);
-            } else {
-                host = host_from_ram_block_offset(block, addr);
+            if (migration_incoming_colo_enabled()) {
+                if (migration_incoming_in_colo_state()) {
+                    /* In COLO stage, put all pages into cache temporarily */
+                    host = colo_cache_from_block_offset(block, addr);
+                } else {
+                   /*
+                    * In migration stage but before COLO stage,
+                    * Put all pages into both cache and SVM's memory.
+                    */
+                    host_bak = colo_cache_from_block_offset(block, addr);
+                }
             }
             if (!host) {
                 error_report("Illegal RAM offset " RAM_ADDR_FMT, addr);
                 ret = -EINVAL;
                 break;
             }
-
             if (!migration_incoming_in_colo_state()) {
                 ramblock_recv_bitmap_set(block, host);
             }
@@ -3506,6 +3535,9 @@ static int ram_load_precopy(QEMUFile *f)
         if (!ret) {
             ret = qemu_file_get_error(f);
         }
+        if (!ret && host_bak) {
+            memcpy(host_bak, host, TARGET_PAGE_SIZE);
+        }
     }
 
     ret |= wait_for_decompress_done();
diff --git a/migration/ram.h b/migration/ram.h
index a553d40751..5ceaff7cb4 100644
--- a/migration/ram.h
+++ b/migration/ram.h
@@ -66,5 +66,6 @@ int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *rb);
 /* ram cache */
 int colo_init_ram_cache(void);
 void colo_release_ram_cache(void);
+void colo_incoming_start_dirty_log(void);
 
 #endif
-- 
2.21.0

From nobody Thu Nov 13 16:32:42 2025
smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org ARC-Seal: i=1; a=rsa-sha256; t=1582527364; cv=none; d=zohomail.com; s=zohoarc; b=fDD131f/LaHN5VP7O4foFY09Szd9WIBRqFyVhItlAZHRJjif1BM/TCPJZC08UiS81HaZNXfNPg7CY7CUe9/bNdoiG5s2l0U43mL/on543liLnJnG58vrr8ziR69YXBYnn7LEZm4Z5KPcce5nVmeBgg45ZKQD8hWg8VNmrY5Xi/0= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1582527364; h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=8qOBXmAnBJ+gMVMCMcP9ammcr+5fCQipzwkcCpVmDPE=; b=T2O3KQtZIz1U5MauvdLR6Se0+34ARheQc//P2uLrbifLXK8juKPKE/RVEFkob6LWAxN+ZsmUR3/5hLuhhSzeXA2LLDOOX6vnnSOROltCJzxxbM+McJvmhP0nDdyKF6aXVD07VLyQRTOxTB/2ype59qF5dV1w/klq8iW8n/fjYrk= ARC-Authentication-Results: i=1; mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1582527364597915.7204753840052; Sun, 23 Feb 2020 22:56:04 -0800 (PST) Received: from localhost ([::1]:60632 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1j67er-0004zE-Ki for importer@patchew.org; Mon, 24 Feb 2020 01:56:01 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:48113) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1j67dk-0003Fx-Hr for qemu-devel@nongnu.org; Mon, 24 Feb 2020 01:54:53 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1j67dj-0006CC-D3 for qemu-devel@nongnu.org; Mon, 24 Feb 2020 01:54:52 -0500 Received: from szxga06-in.huawei.com ([45.249.212.32]:52408 helo=huawei.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1j67dj-00064q-1H for qemu-devel@nongnu.org; 
Mon, 24 Feb 2020 01:54:51 -0500 Received: from DGGEMS414-HUB.china.huawei.com (unknown [172.30.72.59]) by Forcepoint Email with ESMTP id 1485A79627075DC5B3B5; Mon, 24 Feb 2020 14:54:46 +0800 (CST) Received: from huawei.com (10.133.214.142) by DGGEMS414-HUB.china.huawei.com (10.3.19.214) with Microsoft SMTP Server id 14.3.439.0; Mon, 24 Feb 2020 14:54:40 +0800 From: zhanghailiang To: Subject: [PATCH V2 5/8] ram/colo: only record bitmap of dirty pages in COLO stage Date: Mon, 24 Feb 2020 14:54:11 +0800 Message-ID: <20200224065414.36524-6-zhang.zhanghailiang@huawei.com> X-Mailer: git-send-email 2.21.0.windows.1 In-Reply-To: <20200224065414.36524-1-zhang.zhanghailiang@huawei.com> References: <20200224065414.36524-1-zhang.zhanghailiang@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.133.214.142] X-CFilter-Loop: Reflected Content-Transfer-Encoding: quoted-printable X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 45.249.212.32 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: danielcho@qnap.com, zhanghailiang , dgilbert@redhat.com, quintela@redhat.com Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" Content-Type: text/plain; charset="utf-8" The dirty-page bitmap only needs to be recorded once the COLO stage has been entered. Signed-off-by: zhanghailiang Reviewed-by: Dr. 
David Alan Gilbert --- migration/ram.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/migration/ram.c b/migration/ram.c index ebf9e6ba51..1b3f423351 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -2735,7 +2735,7 @@ static inline void *host_from_ram_block_offset(RAMBlo= ck *block, } =20 static inline void *colo_cache_from_block_offset(RAMBlock *block, - ram_addr_t offset) + ram_addr_t offset, bool record_bitmap) { if (!offset_in_ramblock(block, offset)) { return NULL; @@ -2751,7 +2751,8 @@ static inline void *colo_cache_from_block_offset(RAMB= lock *block, * It help us to decide which pages in ram cache should be flushed * into VM's RAM later. */ - if (!test_and_set_bit(offset >> TARGET_PAGE_BITS, block->bmap)) { + if (record_bitmap && + !test_and_set_bit(offset >> TARGET_PAGE_BITS, block->bmap)) { ram_state->migration_dirty_pages++; } return block->colo_cache + offset; @@ -3408,13 +3409,13 @@ static int ram_load_precopy(QEMUFile *f) if (migration_incoming_colo_enabled()) { if (migration_incoming_in_colo_state()) { /* In COLO stage, put all pages into cache temporarily= */ - host =3D colo_cache_from_block_offset(block, addr); + host =3D colo_cache_from_block_offset(block, addr, tru= e); } else { /* * In migration stage but before COLO stage, * Put all pages into both cache and SVM's memory. 
*/ - host_bak =3D colo_cache_from_block_offset(block, addr); + host_bak =3D colo_cache_from_block_offset(block, addr,= false); } } if (!host) { --=20 2.21.0 From nobody Thu Nov 13 16:32:42 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org ARC-Seal: i=1; a=rsa-sha256; t=1582527366; cv=none; d=zohomail.com; s=zohoarc; b=GWLNqAXnuuCKj08qCpJ0oI/ae2fHt6c9l/GG0vBoKSHgBPATu7rRr5SrGwrgdjV5jg+HLLKgS+aIO3p1Hrmdq16r6tCnIDwpOqPx35kXHEEupXo6613KQAv9yokp5p1APC8Ga3rK+Z6Ef9D4PrGTCOd8aecj1ar4GYic9TuZX7o= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1582527366; h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=Enj9nnM9WL8Nyz9HpvbhS0GFYxmpOuoIg+xl/aHce/w=; b=gW/zMIjydE7hUWP2Yx+I84r+26yWMYAam3pY3o7DIH9AmIBhOupX53G8lHcFkhy/PBjmkQ81fy58VTsmuFj03jQ4SAd5ukQ9USiwpZzWMveHcLr95HwF7hKTvFqjKPagrHSNhDrEtG7Ry4rt9oKorLRaZqIzfnIbbo7EBVBOuAE= ARC-Authentication-Results: i=1; mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1582527366271772.026248186764; Sun, 23 Feb 2020 22:56:06 -0800 (PST) Received: from localhost ([::1]:60636 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1j67eu-00055l-JS for importer@patchew.org; Mon, 24 Feb 2020 01:56:04 -0500 Received: from 
eggs.gnu.org ([2001:470:142:3::10]:48166) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1j67dm-0003GS-IG for qemu-devel@nongnu.org; Mon, 24 Feb 2020 01:54:55 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1j67dl-0006E6-LR for qemu-devel@nongnu.org; Mon, 24 Feb 2020 01:54:54 -0500 Received: from szxga07-in.huawei.com ([45.249.212.35]:40950 helo=huawei.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1j67dl-0006Cd-9F for qemu-devel@nongnu.org; Mon, 24 Feb 2020 01:54:53 -0500 Received: from DGGEMS414-HUB.china.huawei.com (unknown [172.30.72.58]) by Forcepoint Email with ESMTP id 207AEBF394EC29AE2C13; Mon, 24 Feb 2020 14:54:51 +0800 (CST) Received: from huawei.com (10.133.214.142) by DGGEMS414-HUB.china.huawei.com (10.3.19.214) with Microsoft SMTP Server id 14.3.439.0; Mon, 24 Feb 2020 14:54:40 +0800 From: zhanghailiang To: Subject: [PATCH V2 6/8] migration: recognize COLO as part of activating process Date: Mon, 24 Feb 2020 14:54:12 +0800 Message-ID: <20200224065414.36524-7-zhang.zhanghailiang@huawei.com> X-Mailer: git-send-email 2.21.0.windows.1 In-Reply-To: <20200224065414.36524-1-zhang.zhanghailiang@huawei.com> References: <20200224065414.36524-1-zhang.zhanghailiang@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.133.214.142] X-CFilter-Loop: Reflected Content-Transfer-Encoding: quoted-printable X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 45.249.212.35 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: danielcho@qnap.com, zhanghailiang , dgilbert@redhat.com, quintela@redhat.com Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" Content-Type: text/plain; charset="utf-8" We will migrate some dirty pages in the background during the gap between 
two checkpoints. Without this modification that will not work, because ram_save_iterate() checks migration_is_setup_or_active() before sending RAM_SAVE_FLAG_EOS at its end. Signed-off-by: zhanghailiang Reviewed-by: Dr. David Alan Gilbert --- migration/migration.c | 1 + 1 file changed, 1 insertion(+) diff --git a/migration/migration.c b/migration/migration.c index e8c62c6e2e..f71c337600 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -840,6 +840,7 @@ bool migration_is_setup_or_active(int state) case MIGRATION_STATUS_PRE_SWITCHOVER: case MIGRATION_STATUS_DEVICE: case MIGRATION_STATUS_WAIT_UNPLUG: + case MIGRATION_STATUS_COLO: return true; =20 default: --=20 2.21.0 From nobody Thu Nov 13 16:32:42 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org ARC-Seal: i=1; a=rsa-sha256; t=1582527610; cv=none; d=zohomail.com; s=zohoarc; b=CSnJiJSE+7n+c/mgXKHlvvScIMEsiFnCaDeCR3qe4e33kkRUDQjXFD9iEJvBV992UTNPpH6Bz9SW2/6l2G8vcCkf113gXLMCsZzFT2VGLpGrjYmISKVraJM1I98xkisCxMuAJoEZMw/9Cz9B6tAbBjXVe/6+pV18H89SHLGek6s= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1582527610; h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=utACNaqhqg9rq5e4HHgZnoChZXPG9EbATK1LhttDWPg=; b=MpOwgnYBCLzXPvY+dMVu5kQ3GtESIvUWGJsoVG+2WDXwkvFtSfuwJX+NtjvCIk3Mbgo1fNnZTL6DLwU0DkswKS4Uyt2IlOH2U8aARkcxmqmA6q5/gEdleLRcHMSWiNZx9CqwpJNH9zJv9K52ZUI6VPx//IV7Mdt69D/goPUMIsA= ARC-Authentication-Results: i=1; mx.zohomail.com; spf=pass (zohomail.com: domain of 
gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1582527610141343.3174909402642; Sun, 23 Feb 2020 23:00:10 -0800 (PST) Received: from localhost ([::1]:60672 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1j67iq-0001rL-TS for importer@patchew.org; Mon, 24 Feb 2020 02:00:08 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:48184) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1j67dn-0003Hb-9G for qemu-devel@nongnu.org; Mon, 24 Feb 2020 01:54:56 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1j67dl-0006ES-Tv for qemu-devel@nongnu.org; Mon, 24 Feb 2020 01:54:55 -0500 Received: from szxga07-in.huawei.com ([45.249.212.35]:40948 helo=huawei.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1j67dl-0006Ce-9f for qemu-devel@nongnu.org; Mon, 24 Feb 2020 01:54:53 -0500 Received: from DGGEMS414-HUB.china.huawei.com (unknown [172.30.72.58]) by Forcepoint Email with ESMTP id 1A7A07E78B65621728BD; Mon, 24 Feb 2020 14:54:51 +0800 (CST) Received: from huawei.com (10.133.214.142) by DGGEMS414-HUB.china.huawei.com (10.3.19.214) with Microsoft SMTP Server id 14.3.439.0; Mon, 24 Feb 2020 14:54:41 +0800 From: zhanghailiang To: Subject: [PATCH V2 7/8] COLO: Migrate dirty pages during the gap of checkpointing Date: Mon, 24 Feb 2020 14:54:13 +0800 Message-ID: <20200224065414.36524-8-zhang.zhanghailiang@huawei.com> X-Mailer: git-send-email 2.21.0.windows.1 In-Reply-To: <20200224065414.36524-1-zhang.zhanghailiang@huawei.com> References: <20200224065414.36524-1-zhang.zhanghailiang@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.133.214.142] X-CFilter-Loop: Reflected Content-Transfer-Encoding: quoted-printable X-detected-operating-system: by eggs.gnu.org: 
GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 45.249.212.35 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: danielcho@qnap.com, zhanghailiang , dgilbert@redhat.com, quintela@redhat.com Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" Content-Type: text/plain; charset="utf-8" We can migrate some dirty pages during the gap between checkpoints, which reduces the amount of RAM that has to be migrated during checkpointing itself. Signed-off-by: zhanghailiang --- migration/colo.c | 73 ++++++++++++++++++++++++++++++++++++++++-- migration/migration.h | 1 + migration/trace-events | 1 + qapi/migration.json | 4 ++- 4 files changed, 75 insertions(+), 4 deletions(-) diff --git a/migration/colo.c b/migration/colo.c index 44942c4e23..c36d94072f 100644 --- a/migration/colo.c +++ b/migration/colo.c @@ -47,6 +47,13 @@ static COLOMode last_colo_mode; =20 #define COLO_BUFFER_BASE_SIZE (4 * 1024 * 1024) =20 +#define DEFAULT_RAM_PENDING_CHECK 1000 + +/* should be calculated by bandwidth and max downtime ? 
*/ +#define THRESHOLD_PENDING_SIZE (100 * 1024 * 1024UL) + +static int checkpoint_request; + bool migration_in_colo_state(void) { MigrationState *s =3D migrate_get_current(); @@ -517,6 +524,20 @@ static void colo_compare_notify_checkpoint(Notifier *n= otifier, void *data) colo_checkpoint_notify(data); } =20 +static bool colo_need_migrate_ram_background(MigrationState *s) +{ + uint64_t pending_size, pend_pre, pend_compat, pend_post; + int64_t max_size =3D THRESHOLD_PENDING_SIZE; + + qemu_savevm_state_pending(s->to_dst_file, max_size, &pend_pre, + &pend_compat, &pend_post); + pending_size =3D pend_pre + pend_compat + pend_post; + + trace_colo_need_migrate_ram_background(pending_size); + return (pending_size >=3D max_size); +} + + static void colo_process_checkpoint(MigrationState *s) { QIOChannelBuffer *bioc; @@ -572,6 +593,8 @@ static void colo_process_checkpoint(MigrationState *s) =20 timer_mod(s->colo_delay_timer, current_time + s->parameters.x_checkpoint_delay); + timer_mod(s->pending_ram_check_timer, + current_time + DEFAULT_RAM_PENDING_CHECK); =20 while (s->state =3D=3D MIGRATION_STATUS_COLO) { if (failover_get_state() !=3D FAILOVER_STATUS_NONE) { @@ -584,9 +607,30 @@ static void colo_process_checkpoint(MigrationState *s) if (s->state !=3D MIGRATION_STATUS_COLO) { goto out; } - ret =3D colo_do_checkpoint_transaction(s, bioc, fb); - if (ret < 0) { - goto out; + if (atomic_xchg(&checkpoint_request, 0)) { + /* start a colo checkpoint */ + ret =3D colo_do_checkpoint_transaction(s, bioc, fb); + if (ret < 0) { + goto out; + } + } else { + if (colo_need_migrate_ram_background(s)) { + colo_send_message(s->to_dst_file, + COLO_MESSAGE_MIGRATE_RAM_BACKGROUND, + &local_err); + if (local_err) { + goto out; + } + + qemu_savevm_state_iterate(s->to_dst_file, false); + qemu_put_byte(s->to_dst_file, QEMU_VM_EOF); + ret =3D qemu_file_get_error(s->to_dst_file); + if (ret < 0) { + error_setg_errno(&local_err, -ret, + "Failed to send dirty pages backgroud"); + goto out; + } + } } } 
=20 @@ -627,6 +671,8 @@ out: colo_compare_unregister_notifier(&packets_compare_notifier); timer_del(s->colo_delay_timer); timer_free(s->colo_delay_timer); + timer_del(s->pending_ram_check_timer); + timer_free(s->pending_ram_check_timer); qemu_sem_destroy(&s->colo_checkpoint_sem); =20 /* @@ -644,6 +690,7 @@ void colo_checkpoint_notify(void *opaque) MigrationState *s =3D opaque; int64_t next_notify_time; =20 + atomic_inc(&checkpoint_request); qemu_sem_post(&s->colo_checkpoint_sem); s->colo_checkpoint_time =3D qemu_clock_get_ms(QEMU_CLOCK_HOST); next_notify_time =3D s->colo_checkpoint_time + @@ -651,6 +698,19 @@ void colo_checkpoint_notify(void *opaque) timer_mod(s->colo_delay_timer, next_notify_time); } =20 +static void colo_pending_ram_check_notify(void *opaque) +{ + int64_t next_notify_time; + MigrationState *s =3D opaque; + + if (migration_in_colo_state()) { + next_notify_time =3D DEFAULT_RAM_PENDING_CHECK + + qemu_clock_get_ms(QEMU_CLOCK_HOST); + timer_mod(s->pending_ram_check_timer, next_notify_time); + qemu_sem_post(&s->colo_checkpoint_sem); + } +} + void migrate_start_colo_process(MigrationState *s) { qemu_mutex_unlock_iothread(); @@ -658,6 +718,8 @@ void migrate_start_colo_process(MigrationState *s) s->colo_delay_timer =3D timer_new_ms(QEMU_CLOCK_HOST, colo_checkpoint_notify, s); =20 + s->pending_ram_check_timer =3D timer_new_ms(QEMU_CLOCK_HOST, + colo_pending_ram_check_notify, s); qemu_sem_init(&s->colo_exit_sem, 0); migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE, MIGRATION_STATUS_COLO); @@ -806,6 +868,11 @@ static void colo_wait_handle_message(MigrationIncoming= State *mis, case COLO_MESSAGE_CHECKPOINT_REQUEST: colo_incoming_process_checkpoint(mis, fb, bioc, errp); break; + case COLO_MESSAGE_MIGRATE_RAM_BACKGROUND: + if (qemu_loadvm_state_main(mis->from_src_file, mis) < 0) { + error_setg(errp, "Load ram background failed"); + } + break; default: error_setg(errp, "Got unknown COLO message: %d", msg); break; diff --git a/migration/migration.h 
b/migration/migration.h index 8473ddfc88..5355259789 100644 --- a/migration/migration.h +++ b/migration/migration.h @@ -219,6 +219,7 @@ struct MigrationState QemuSemaphore colo_checkpoint_sem; int64_t colo_checkpoint_time; QEMUTimer *colo_delay_timer; + QEMUTimer *pending_ram_check_timer; =20 /* The first error that has occurred. We used the mutex to be able to return the 1st error message */ diff --git a/migration/trace-events b/migration/trace-events index 4ab0a503d2..f2ed0c8645 100644 --- a/migration/trace-events +++ b/migration/trace-events @@ -295,6 +295,7 @@ migration_tls_incoming_handshake_complete(void) "" colo_vm_state_change(const char *old, const char *new) "Change '%s' =3D> '= %s'" colo_send_message(const char *msg) "Send '%s' message" colo_receive_message(const char *msg) "Receive '%s' message" +colo_need_migrate_ram_background(uint64_t pending_size) "Pending 0x%" PRIx= 64 " dirty ram" =20 # colo-failover.c colo_failover_set_state(const char *new_state) "new state %s" diff --git a/qapi/migration.json b/qapi/migration.json index 52f3429969..73445f1978 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -977,12 +977,14 @@ # # @vmstate-loaded: VM's state has been loaded by SVM. 
# +# @migrate-ram-background: Send some dirty pages during the gap of COLO ch= eckpoint +# # Since: 2.8 ## { 'enum': 'COLOMessage', 'data': [ 'checkpoint-ready', 'checkpoint-request', 'checkpoint-reply', 'vmstate-send', 'vmstate-size', 'vmstate-received', - 'vmstate-loaded' ] } + 'vmstate-loaded', 'migrate-ram-background' ] } =20 ## # @COLOMode: --=20 2.21.0 From nobody Thu Nov 13 16:32:42 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org ARC-Seal: i=1; a=rsa-sha256; t=1582527607; cv=none; d=zohomail.com; s=zohoarc; b=gKqUU1iuVWIzg1dHH6EeYYWPR2XBF3lsPWOuJPo3frD1baL9hqy3HZXcEmLisrYtN726o0ZGwG0ub8xJgN9I9woaTfRY13hoo8AZ2dNl/cph+177jbWH+e+l5ef7BN48Id4gR+jvV2wCKQ0843Ffuw/tcQo4kmi5zoHNAKdza6A= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1582527607; h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=J6xPRE6q5U6vDsgVJzVMn334ClJQ0s4VwLCdf9SNeSg=; b=dYJFU2PicUtJDXvvZtqjHM/VMGWQQCxrSYq9WXBMEcAKUqC0YU4LzV1pRG6SgEFeyheZpnJXLUUs6POLCKl4mV6tQlbd3iIBiINCweoahjm+6EnghnyZIigkKN98h4e9aSiktX/FySSCCSerK70/9WYPQEWWTOtjWUSIt1dV+p4= ARC-Authentication-Results: i=1; mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1582527607730459.37126291667573; Sun, 23 Feb 2020 23:00:07 -0800 (PST) Received: 
from localhost ([::1]:60670 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1j67io-0001mf-E1 for importer@patchew.org; Mon, 24 Feb 2020 02:00:06 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:48172) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1j67dm-0003GY-P4 for qemu-devel@nongnu.org; Mon, 24 Feb 2020 01:54:55 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1j67dl-0006EE-P0 for qemu-devel@nongnu.org; Mon, 24 Feb 2020 01:54:54 -0500 Received: from szxga06-in.huawei.com ([45.249.212.32]:52524 helo=huawei.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1j67dl-0006Co-DN for qemu-devel@nongnu.org; Mon, 24 Feb 2020 01:54:53 -0500 Received: from DGGEMS414-HUB.china.huawei.com (unknown [172.30.72.60]) by Forcepoint Email with ESMTP id 4E4FD83CDB1972F96EFE; Mon, 24 Feb 2020 14:54:51 +0800 (CST) Received: from huawei.com (10.133.214.142) by DGGEMS414-HUB.china.huawei.com (10.3.19.214) with Microsoft SMTP Server id 14.3.439.0; Mon, 24 Feb 2020 14:54:42 +0800 From: zhanghailiang To: Subject: [PATCH V2 8/8] migration/colo: Only flush ram cache while do checkpoint Date: Mon, 24 Feb 2020 14:54:14 +0800 Message-ID: <20200224065414.36524-9-zhang.zhanghailiang@huawei.com> X-Mailer: git-send-email 2.21.0.windows.1 In-Reply-To: <20200224065414.36524-1-zhang.zhanghailiang@huawei.com> References: <20200224065414.36524-1-zhang.zhanghailiang@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.133.214.142] X-CFilter-Loop: Reflected Content-Transfer-Encoding: quoted-printable X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 45.249.212.32 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: danielcho@qnap.com, zhanghailiang , dgilbert@redhat.com, 
quintela@redhat.com Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" Content-Type: text/plain; charset="utf-8" Now that RAM is migrated in the background, ram_load() is also called for that process, but the RAM cache must not be flushed while it runs. Move the flush into colo_incoming_process_checkpoint(), so that it only happens when a checkpoint is actually being done. Signed-off-by: zhanghailiang Reviewed-by: Dr. David Alan Gilbert --- migration/colo.c | 1 + migration/ram.c | 5 +---- migration/ram.h | 1 + 3 files changed, 3 insertions(+), 4 deletions(-) diff --git a/migration/colo.c b/migration/colo.c index c36d94072f..18df8289f8 100644 --- a/migration/colo.c +++ b/migration/colo.c @@ -799,6 +799,7 @@ static void colo_incoming_process_checkpoint(MigrationI= ncomingState *mis, =20 qemu_mutex_lock_iothread(); vmstate_loading =3D true; + colo_flush_ram_cache(); ret =3D qemu_load_device_state(fb); if (ret < 0) { error_setg(errp, "COLO: load device state failed"); diff --git a/migration/ram.c b/migration/ram.c index 1b3f423351..7bc841d14f 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -3305,7 +3305,7 @@ static bool postcopy_is_running(void) * Flush content of RAM cache into SVM's memory. * Only flush the pages that be dirtied by PVM or SVM or both. 
*/ -static void colo_flush_ram_cache(void) +void colo_flush_ram_cache(void) { RAMBlock *block =3D NULL; void *dst_host; @@ -3576,9 +3576,6 @@ static int ram_load(QEMUFile *f, void *opaque, int ve= rsion_id) } trace_ram_load_complete(ret, seq_iter); =20 - if (!ret && migration_incoming_in_colo_state()) { - colo_flush_ram_cache(); - } return ret; } =20 diff --git a/migration/ram.h b/migration/ram.h index 5ceaff7cb4..ae14341482 100644 --- a/migration/ram.h +++ b/migration/ram.h @@ -67,5 +67,6 @@ int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *= rb); int colo_init_ram_cache(void); void colo_release_ram_cache(void); void colo_incoming_start_dirty_log(void); +void colo_flush_ram_cache(void); =20 #endif --=20 2.21.0
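Note on patches 5/8 and 8/8: the interaction between the record_bitmap argument and the deferred cache flush can be illustrated with a toy model. The sketch below is not QEMU code. ToyBlock, toy_cache_from_offset(), toy_flush_cache(), TOY_PAGE_SIZE and TOY_NUM_PAGES are hypothetical names invented for illustration, and a plain bool array stands in for the real dirty bitmap. It only shows the idea that pages written to the cache before the COLO stage are not marked dirty, while pages written during the COLO stage are marked and later copied into guest memory by one explicit flush at checkpoint time.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy sizes, chosen only to keep the example small (assumption). */
#define TOY_PAGE_SIZE 16u
#define TOY_NUM_PAGES 8u

/* Hypothetical stand-in for a RAM block with a cache and a dirty map. */
typedef struct {
    uint8_t ram[TOY_NUM_PAGES * TOY_PAGE_SIZE];   /* "SVM memory" */
    uint8_t cache[TOY_NUM_PAGES * TOY_PAGE_SIZE]; /* "COLO cache" */
    bool dirty[TOY_NUM_PAGES];                    /* one flag per page */
    size_t dirty_pages;                           /* number of set flags */
} ToyBlock;

/*
 * Return a pointer into the cache for the given byte offset, or NULL if
 * the offset is out of range. The page is marked dirty only when
 * record_bitmap is true, mirroring the parameter added in patch 5/8.
 */
uint8_t *toy_cache_from_offset(ToyBlock *b, size_t offset, bool record_bitmap)
{
    if (offset >= sizeof(b->cache)) {
        return NULL;
    }
    size_t page = offset / TOY_PAGE_SIZE;
    if (record_bitmap && !b->dirty[page]) {
        b->dirty[page] = true;
        b->dirty_pages++;
    }
    return b->cache + offset;
}

/*
 * Copy every dirty page from the cache into ram and clear its flag.
 * Returns the number of pages copied. In the toy model this is called
 * explicitly at checkpoint time, not after every load.
 */
size_t toy_flush_cache(ToyBlock *b)
{
    size_t flushed = 0;
    for (size_t page = 0; page < TOY_NUM_PAGES; page++) {
        if (!b->dirty[page]) {
            continue;
        }
        memcpy(b->ram + page * TOY_PAGE_SIZE,
               b->cache + page * TOY_PAGE_SIZE, TOY_PAGE_SIZE);
        b->dirty[page] = false;
        flushed++;
    }
    b->dirty_pages = 0;
    return flushed;
}
```

In this model a caller would pass record_bitmap = false for the pre-COLO backup copy and record_bitmap = true once in the COLO stage, then call toy_flush_cache() once per checkpoint. Background RAM loads between checkpoints only fill the cache and set flags, and never touch ram directly.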