From: Juan Quintela <quintela@redhat.com>
To: qemu-devel@nongnu.org
Cc: lvivier@redhat.com, dgilbert@redhat.com, peterx@redhat.com
Date: Thu, 21 Jun 2018 01:28:46 +0200
Message-Id: <20180620232851.7152-8-quintela@redhat.com>
In-Reply-To: <20180620232851.7152-1-quintela@redhat.com>
References: <20180620232851.7152-1-quintela@redhat.com>
Subject: [Qemu-devel] [PATCH v15 07/12] migration: Synchronize multifd threads with main thread

We synchronize all threads at each RAM_SAVE_FLAG_EOS.  Bitmap
synchronizations don't happen inside a RAM section, so there is no
risk of two channels trying to overwrite the same memory.

Signed-off-by: Juan Quintela <quintela@redhat.com>

--
seq needs to be atomic now, as it will also be accessed from the main thread.
Fix the if (true || ...)
leftover
---
 migration/ram.c        | 147 ++++++++++++++++++++++++++++++++---------
 migration/trace-events |   6 ++
 2 files changed, 122 insertions(+), 31 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 793f0dc5d3..516f347d24 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -510,6 +510,8 @@ exit:
 #define MULTIFD_MAGIC 0x11223344U
 #define MULTIFD_VERSION 1
 
+#define MULTIFD_FLAG_SYNC (1 << 0)
+
 typedef struct {
     uint32_t magic;
     uint32_t version;
@@ -577,6 +579,8 @@ typedef struct {
     uint32_t num_packets;
     /* pages sent through this channel */
     uint32_t num_pages;
+    /* syncs main thread and channels */
+    QemuSemaphore sem_sync;
 } MultiFDSendParams;
 
 typedef struct {
@@ -614,6 +618,8 @@ typedef struct {
     uint32_t num_packets;
     /* pages sent through this channel */
     uint32_t num_pages;
+    /* syncs main thread and channels */
+    QemuSemaphore sem_sync;
 } MultiFDRecvParams;
 
 static int multifd_send_initial_packet(MultiFDSendParams *p, Error **errp)
@@ -801,6 +807,10 @@ struct {
     int count;
     /* array of pages to sent */
     MultiFDPages_t *pages;
+    /* syncs main thread and channels */
+    QemuSemaphore sem_sync;
+    /* global number of generated multifd packets */
+    uint64_t packet_num;
 } *multifd_send_state;
 
 static void multifd_send_terminate_threads(Error *err)
@@ -848,6 +858,7 @@ int multifd_save_cleanup(Error **errp)
         p->c = NULL;
         qemu_mutex_destroy(&p->mutex);
         qemu_sem_destroy(&p->sem);
+        qemu_sem_destroy(&p->sem_sync);
         g_free(p->name);
         p->name = NULL;
         multifd_pages_clear(p->pages);
@@ -856,6 +867,7 @@ int multifd_save_cleanup(Error **errp)
         g_free(p->packet);
         p->packet = NULL;
     }
+    qemu_sem_destroy(&multifd_send_state->sem_sync);
     g_free(multifd_send_state->params);
     multifd_send_state->params = NULL;
     multifd_pages_clear(multifd_send_state->pages);
@@ -865,6 +877,33 @@ int multifd_save_cleanup(Error **errp)
     return ret;
 }
 
+static void multifd_send_sync_main(void)
+{
+    int i;
+
+    if (!migrate_use_multifd()) {
+        return;
+    }
+    for (i = 0; i < migrate_multifd_channels(); i++) {
+        MultiFDSendParams *p = &multifd_send_state->params[i];
+
+        trace_multifd_send_sync_main_signal(p->id);
+
+        qemu_mutex_lock(&p->mutex);
+        p->flags |= MULTIFD_FLAG_SYNC;
+        p->pending_job++;
+        qemu_mutex_unlock(&p->mutex);
+        qemu_sem_post(&p->sem);
+    }
+    for (i = 0; i < migrate_multifd_channels(); i++) {
+        MultiFDSendParams *p = &multifd_send_state->params[i];
+
+        trace_multifd_send_sync_main_wait(p->id);
+        qemu_sem_wait(&multifd_send_state->sem_sync);
+    }
+    trace_multifd_send_sync_main(atomic_read(&multifd_send_state->packet_num));
+}
+
 static void *multifd_send_thread(void *opaque)
 {
     MultiFDSendParams *p = opaque;
@@ -901,15 +940,17 @@ static void *multifd_send_thread(void *opaque)
             qemu_mutex_lock(&p->mutex);
             p->pending_job--;
             qemu_mutex_unlock(&p->mutex);
-            continue;
+
+            if (flags & MULTIFD_FLAG_SYNC) {
+                qemu_sem_post(&multifd_send_state->sem_sync);
+            }
         } else if (p->quit) {
             qemu_mutex_unlock(&p->mutex);
             break;
+        } else {
+            qemu_mutex_unlock(&p->mutex);
+            /* sometimes there are spurious wakeups */
         }
-        qemu_mutex_unlock(&p->mutex);
-        /* this is impossible */
-        error_setg(&local_err, "multifd_send_thread: Unknown command");
-        break;
     }
 
 out:
@@ -961,12 +1002,14 @@ int multifd_save_setup(void)
     multifd_send_state->params = g_new0(MultiFDSendParams, thread_count);
     atomic_set(&multifd_send_state->count, 0);
     multifd_send_state->pages = multifd_pages_init(page_count);
+    qemu_sem_init(&multifd_send_state->sem_sync, 0);
 
     for (i = 0; i < thread_count; i++) {
         MultiFDSendParams *p = &multifd_send_state->params[i];
 
         qemu_mutex_init(&p->mutex);
         qemu_sem_init(&p->sem, 0);
+        qemu_sem_init(&p->sem_sync, 0);
         p->quit = false;
         p->pending_job = 0;
         p->id = i;
@@ -991,6 +1034,10 @@ struct {
     MultiFDRecvParams *params;
     /* number of created threads */
     int count;
+    /* syncs main thread and channels */
+    QemuSemaphore sem_sync;
+    /* global number of generated multifd packets */
+    uint64_t packet_num;
 } *multifd_recv_state;
 
 static void multifd_recv_terminate_threads(Error *err)
@@ -1036,6 +1083,7 @@ int multifd_load_cleanup(Error **errp)
         p->c = NULL;
         qemu_mutex_destroy(&p->mutex);
         qemu_sem_destroy(&p->sem);
+        qemu_sem_destroy(&p->sem_sync);
         g_free(p->name);
         p->name = NULL;
         multifd_pages_clear(p->pages);
@@ -1044,6 +1092,7 @@ int multifd_load_cleanup(Error **errp)
         g_free(p->packet);
         p->packet = NULL;
     }
+    qemu_sem_destroy(&multifd_recv_state->sem_sync);
     g_free(multifd_recv_state->params);
     multifd_recv_state->params = NULL;
     g_free(multifd_recv_state);
@@ -1052,6 +1101,42 @@ int multifd_load_cleanup(Error **errp)
     return ret;
 }
 
+static void multifd_recv_sync_main(void)
+{
+    int i;
+
+    if (!migrate_use_multifd()) {
+        return;
+    }
+    for (i = 0; i < migrate_multifd_channels(); i++) {
+        MultiFDRecvParams *p = &multifd_recv_state->params[i];
+
+        trace_multifd_recv_sync_main_signal(p->id);
+        qemu_mutex_lock(&p->mutex);
+        p->pending_job = true;
+        qemu_mutex_unlock(&p->mutex);
+    }
+    for (i = 0; i < migrate_multifd_channels(); i++) {
+        MultiFDRecvParams *p = &multifd_recv_state->params[i];
+
+        trace_multifd_recv_sync_main_wait(p->id);
+        qemu_sem_wait(&multifd_recv_state->sem_sync);
+        qemu_mutex_lock(&p->mutex);
+        if (atomic_read(&multifd_recv_state->packet_num) < p->packet_num) {
+            atomic_set(&multifd_recv_state->packet_num, p->packet_num);
+        }
+        qemu_mutex_unlock(&p->mutex);
+    }
+    for (i = 0; i < migrate_multifd_channels(); i++) {
+        MultiFDRecvParams *p = &multifd_recv_state->params[i];
+
+        trace_multifd_recv_sync_main_signal(p->id);
+
+        qemu_sem_post(&p->sem_sync);
+    }
+    trace_multifd_recv_sync_main(atomic_read(&multifd_recv_state->packet_num));
+}
+
 static void *multifd_recv_thread(void *opaque)
 {
     MultiFDRecvParams *p = opaque;
@@ -1061,37 +1146,30 @@ static void *multifd_recv_thread(void *opaque)
     trace_multifd_recv_thread_start(p->id);
 
     while (true) {
-        qemu_sem_wait(&p->sem);
+        uint32_t used;
+        uint32_t flags;
+
+        /* ToDo: recv packet here */
+
         qemu_mutex_lock(&p->mutex);
-        if (p->pending_job) {
-            uint32_t used;
-            uint32_t flags;
-            qemu_mutex_unlock(&p->mutex);
-
-            /* ToDo: recv packet here */
-
-            qemu_mutex_lock(&p->mutex);
-            ret = multifd_recv_unfill_packet(p, &local_err);
-            if (ret) {
-                qemu_mutex_unlock(&p->mutex);
-                break;
-            }
-
-            used = p->pages->used;
-            flags = p->flags;
-            trace_multifd_recv(p->id, p->packet_num, used, flags);
-            p->pending_job = false;
-            p->num_packets++;
-            p->num_pages += used;
-            qemu_mutex_unlock(&p->mutex);
-        } else if (p->quit) {
+        ret = multifd_recv_unfill_packet(p, &local_err);
+        if (ret) {
             qemu_mutex_unlock(&p->mutex);
             break;
         }
+
+        used = p->pages->used;
+        flags = p->flags;
+        trace_multifd_recv(p->id, p->packet_num, used, flags);
+        p->pending_job = false;
+        p->num_packets++;
+        p->num_pages += used;
         qemu_mutex_unlock(&p->mutex);
-        /* this is impossible */
-        error_setg(&local_err, "multifd_recv_thread: Unknown command");
-        break;
+
+        if (flags & MULTIFD_FLAG_SYNC) {
+            qemu_sem_post(&multifd_recv_state->sem_sync);
+            qemu_sem_wait(&p->sem_sync);
+        }
     }
 
     if (local_err) {
@@ -1119,12 +1197,14 @@ int multifd_load_setup(void)
     multifd_recv_state = g_malloc0(sizeof(*multifd_recv_state));
     multifd_recv_state->params = g_new0(MultiFDRecvParams, thread_count);
     atomic_set(&multifd_recv_state->count, 0);
+    qemu_sem_init(&multifd_recv_state->sem_sync, 0);
 
     for (i = 0; i < thread_count; i++) {
         MultiFDRecvParams *p = &multifd_recv_state->params[i];
 
         qemu_mutex_init(&p->mutex);
         qemu_sem_init(&p->sem, 0);
+        qemu_sem_init(&p->sem_sync, 0);
         p->quit = false;
         p->pending_job = false;
         p->id = i;
@@ -2882,6 +2962,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
     ram_control_before_iterate(f, RAM_CONTROL_SETUP);
     ram_control_after_iterate(f, RAM_CONTROL_SETUP);
 
+    multifd_send_sync_main();
     qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
 
     return 0;
@@ -2962,6 +3043,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
      */
     ram_control_after_iterate(f, RAM_CONTROL_ROUND);
 
+    multifd_send_sync_main();
 out:
     qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
     ram_counters.transferred += 8;
@@ -3015,6 +3097,7 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
 
     rcu_read_unlock();
 
+    multifd_send_sync_main();
     qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
 
     return 0;
@@ -3504,6 +3587,7 @@ static int ram_load_postcopy(QEMUFile *f)
             break;
         case RAM_SAVE_FLAG_EOS:
             /* normal exit */
+            multifd_recv_sync_main();
             break;
         default:
             error_report("Unknown combination of migration flags: %#x"
@@ -3692,6 +3776,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
             break;
         case RAM_SAVE_FLAG_EOS:
             /* normal exit */
+            multifd_recv_sync_main();
             break;
         default:
             if (flags & RAM_SAVE_FLAG_HOOK) {
diff --git a/migration/trace-events b/migration/trace-events
index c667d98529..ea39bb2bc5 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -77,9 +77,15 @@ migration_bitmap_sync_start(void) ""
 migration_bitmap_sync_end(uint64_t dirty_pages) "dirty_pages %" PRIu64
 migration_throttle(void) ""
 multifd_recv(uint8_t id, uint64_t packet_num, uint32_t used, uint32_t flags) "channel %d packet number %ld pages %d flags 0x%x"
+multifd_recv_sync_main(uint32_t seq) "seq %d"
+multifd_recv_sync_main_signal(uint8_t id) "channel %d"
+multifd_recv_sync_main_wait(uint8_t id) "channel %d"
 multifd_recv_thread_end(uint8_t id, uint32_t packets, uint32_t pages) "channel %d packets %d pages %d"
 multifd_recv_thread_start(uint8_t id) "%d"
 multifd_send(uint8_t id, uint64_t packet_num, uint32_t used, uint32_t flags) "channel %d packet_num %ld pages %d flags 0x%x"
+multifd_send_sync_main(uint32_t seq) "seq %d"
+multifd_send_sync_main_signal(uint8_t id) "channel %d"
+multifd_send_sync_main_wait(uint8_t id) "channel %d"
 multifd_send_thread_end(uint8_t id, uint32_t packets, uint32_t pages) "channel %d packets %d pages %d"
 multifd_send_thread_start(uint8_t id) "%d"
 ram_discard_range(const char *rbname, uint64_t start, size_t len) "%s: start: %" PRIx64 " %zx"
-- 
2.17.1
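
Note on the send-side handshake: multifd_send_sync_main() flags every
channel with MULTIFD_FLAG_SYNC, wakes it through its own semaphore, and
then waits on the shared sem_sync once per channel. The sketch below
reproduces that pattern outside QEMU so it can be tried in isolation.
It is only an illustration under stated assumptions: POSIX threads and
unnamed POSIX semaphores (as on Linux) stand in for QemuMutex and
QemuSemaphore, and Channel, sync_main(), sync_demo() and the syncs_seen
counter are hypothetical names that do not exist in migration/ram.c.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdbool.h>
#include <stdint.h>

#define FLAG_SYNC (1u << 0)      /* plays the role of MULTIFD_FLAG_SYNC */
#define MAX_CHANNELS 16          /* arbitrary limit for this sketch */

typedef struct {
    pthread_t thread;
    pthread_mutex_t mutex;       /* protects flags, pending_job and quit */
    sem_t sem;                   /* main -> channel wakeup, like p->sem */
    uint32_t flags;
    int pending_job;
    bool quit;
    int syncs_seen;              /* hypothetical counter, for checking only */
} Channel;

static Channel channels[MAX_CHANNELS];
static sem_t sem_sync;           /* channel -> main acknowledgement */

static void *channel_thread(void *opaque)
{
    Channel *c = opaque;

    for (;;) {
        sem_wait(&c->sem);
        pthread_mutex_lock(&c->mutex);
        if (c->pending_job) {
            uint32_t flags = c->flags;

            c->flags = 0;
            c->pending_job--;
            pthread_mutex_unlock(&c->mutex);
            /* a real channel would send its packet here */
            if (flags & FLAG_SYNC) {
                c->syncs_seen++;
                sem_post(&sem_sync);   /* tell main this channel is drained */
            }
        } else if (c->quit) {
            pthread_mutex_unlock(&c->mutex);
            break;
        } else {
            pthread_mutex_unlock(&c->mutex);   /* spurious wakeup: loop again */
        }
    }
    return NULL;
}

/* Ask every channel to sync, then wait for one acknowledgement per channel. */
static void sync_main(int n)
{
    for (int i = 0; i < n; i++) {
        Channel *c = &channels[i];

        pthread_mutex_lock(&c->mutex);
        c->flags |= FLAG_SYNC;
        c->pending_job++;
        pthread_mutex_unlock(&c->mutex);
        sem_post(&c->sem);
    }
    for (int i = 0; i < n; i++) {
        sem_wait(&sem_sync);
    }
}

/* Runs `rounds` sync points over `n` channels; returns total syncs handled. */
int sync_demo(int n, int rounds)
{
    int total = 0;

    assert(n > 0 && n <= MAX_CHANNELS);
    sem_init(&sem_sync, 0, 0);
    for (int i = 0; i < n; i++) {
        Channel *c = &channels[i];

        pthread_mutex_init(&c->mutex, NULL);
        sem_init(&c->sem, 0, 0);
        c->flags = 0;
        c->pending_job = 0;
        c->quit = false;
        c->syncs_seen = 0;
        pthread_create(&c->thread, NULL, channel_thread, c);
    }
    for (int r = 0; r < rounds; r++) {
        sync_main(n);
    }
    for (int i = 0; i < n; i++) {
        Channel *c = &channels[i];

        pthread_mutex_lock(&c->mutex);
        c->quit = true;
        pthread_mutex_unlock(&c->mutex);
        sem_post(&c->sem);             /* wake the channel so it sees quit */
        pthread_join(c->thread, NULL);
        total += c->syncs_seen;
        sem_destroy(&c->sem);
        pthread_mutex_destroy(&c->mutex);
    }
    sem_destroy(&sem_sync);
    return total;
}
```

Each round handles exactly one sync per channel, so sync_demo(4, 3)
returns 12. The receive side in the patch adds a second step that this
sketch leaves out: each channel also blocks on its own sem_sync until
the main thread has collected an acknowledgement from every channel.
Build with -pthread.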