From nobody Fri May 3 20:09:00 2024
From: "Dr. David Alan Gilbert (git)"
To: qemu-devel@nongnu.org, borntraeger@de.ibm.com, quintela@redhat.com, lvivier@redhat.com, peterx@redhat.com
Date: Wed, 26 Apr 2017 19:37:20 +0100
Message-Id: <20170426183721.7482-2-dgilbert@redhat.com>
In-Reply-To: <20170426183721.7482-1-dgilbert@redhat.com>
References: <20170426183721.7482-1-dgilbert@redhat.com>
Subject: [Qemu-devel] [PATCH 1/2] Postcopy: Force allocation of all-zero precopy pages
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: "Dr. David Alan Gilbert"

When an all-zero page is received during the precopy
phase of a postcopy-enabled migration we must force
allocation; otherwise accesses to the page will still
get blocked by userfault.

Symptoms:
a) If the page is accessed by a device during device-load
then we get a deadlock as the source finishes sending
all its pages but the destination device-load is still
paused and so doesn't clean up.

b) If the page is accessed later, then the thread will stay
paused until we release userfault at the end of migration,
rather than carrying on running.

Signed-off-by: Dr. David Alan Gilbert
Reported-by: Christian Borntraeger
Tested-by: Christian Borntraeger
---
 include/migration/migration.h |  3 ++-
 migration/ram.c               | 12 ++++++++----
 migration/rdma.c              |  2 +-
 3 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/include/migration/migration.h b/include/migration/migration.h
index ba1a16cbc1..b47904033c 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -265,7 +265,8 @@ uint64_t xbzrle_mig_pages_overflow(void);
 uint64_t xbzrle_mig_pages_cache_miss(void);
 double xbzrle_mig_cache_miss_rate(void);
 
-void ram_handle_compressed(void *host, uint8_t ch, uint64_t size);
+void ram_handle_compressed(void *host, uint8_t ch, uint64_t size,
+                           bool always_write);
 void ram_debug_dump_bitmap(unsigned long *todump, bool expected);
 /* For outgoing discard bitmap */
 int ram_postcopy_send_discard_bitmap(MigrationState *ms);
diff --git a/migration/ram.c b/migration/ram.c
index f48664ec62..b4ed41c725 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2274,10 +2274,12 @@ static inline void *host_from_ram_block_offset(RAMBlock *block,
  * @host: host address for the zero page
  * @ch: what the page is filled from. We only support zero
  * @size: size of the zero page
+ * @always_write: Always perform the memset even if it's zero
  */
-void ram_handle_compressed(void *host, uint8_t ch, uint64_t size)
+void ram_handle_compressed(void *host, uint8_t ch, uint64_t size,
+                           bool always_write)
 {
-    if (ch != 0 || !is_zero_range(host, size)) {
+    if (ch != 0 || always_write || !is_zero_range(host, size)) {
         memset(host, ch, size);
     }
 }
@@ -2514,7 +2516,8 @@ static int ram_load_postcopy(QEMUFile *f)
         switch (flags & ~RAM_SAVE_FLAG_CONTINUE) {
         case RAM_SAVE_FLAG_COMPRESS:
             ch = qemu_get_byte(f);
-            memset(page_buffer, ch, TARGET_PAGE_SIZE);
+            ram_handle_compressed(page_buffer, ch, TARGET_PAGE_SIZE,
+                                  true);
             if (ch) {
                 all_zero = false;
             }
@@ -2664,7 +2667,8 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
 
         case RAM_SAVE_FLAG_COMPRESS:
             ch = qemu_get_byte(f);
-            ram_handle_compressed(host, ch, TARGET_PAGE_SIZE);
+            ram_handle_compressed(host, ch, TARGET_PAGE_SIZE,
+                                  postcopy_advised);
             break;
 
         case RAM_SAVE_FLAG_PAGE:
diff --git a/migration/rdma.c b/migration/rdma.c
index fe0a4b5a83..07a9bd75d8 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -3164,7 +3164,7 @@ static int qemu_rdma_registration_handle(QEMUFile *f, void *opaque)
             host_addr = block->local_host_addr +
                             (comp->offset - block->offset);
 
-            ram_handle_compressed(host_addr, comp->value, comp->length);
+            ram_handle_compressed(host_addr, comp->value, comp->length, false);
             break;
 
         case RDMA_CONTROL_REGISTER_FINISHED:
-- 
2.12.2

From nobody Fri May 3 20:09:00 2024
From: "Dr. David Alan Gilbert (git)"
To: qemu-devel@nongnu.org, borntraeger@de.ibm.com, quintela@redhat.com, lvivier@redhat.com, peterx@redhat.com
Date: Wed, 26 Apr 2017 19:37:21 +0100
Message-Id: <20170426183721.7482-3-dgilbert@redhat.com>
In-Reply-To: <20170426183721.7482-1-dgilbert@redhat.com>
References: <20170426183721.7482-1-dgilbert@redhat.com>
Subject: [Qemu-devel] [PATCH 2/2] migration: Extra tracing
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: "Dr. David Alan Gilbert"

A couple more traces that would have made fixing that postcopy
bug a bit easier.

Signed-off-by: Dr. David Alan Gilbert
Acked-by: Christian Borntraeger
Reviewed-by: Philippe Mathieu-Daudé
---
 migration/ram.c        | 2 ++
 migration/trace-events | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/migration/ram.c b/migration/ram.c
index b4ed41c725..3ac41ccaba 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -812,6 +812,7 @@ static int ram_save_page(RAMState *rs, PageSearchStatus *pss, bool last_stage)
     ram_addr_t offset = pss->page << TARGET_PAGE_BITS;
 
     p = block->host + offset;
+    trace_ram_save_page(block->idstr, (uint64_t)offset, p);
 
     /* In doubt sent page as normal */
     bytes_xmit = 0;
@@ -2614,6 +2615,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
                 ret = -EINVAL;
                 break;
             }
+            trace_ram_load_loop(block->idstr, (uint64_t)addr, flags, host);
         }
 
         switch (flags & ~RAM_SAVE_FLAG_CONTINUE) {
diff --git a/migration/trace-events b/migration/trace-events
index b8f01a218c..5b8ccf301c 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -69,8 +69,10 @@ migration_bitmap_sync_start(void) ""
 migration_bitmap_sync_end(uint64_t dirty_pages) "dirty_pages %" PRIu64
 migration_throttle(void) ""
 ram_discard_range(const char *rbname, uint64_t start, size_t len) "%s: start: %" PRIx64 " %zx"
+ram_load_loop(const char *rbname, uint64_t addr, int flags, void *host) "%s: addr: %" PRIx64 " flags: %x host: %p"
 ram_load_postcopy_loop(uint64_t addr, int flags) "@%" PRIx64 " %x"
 ram_postcopy_send_discard_bitmap(void) ""
+ram_save_page(const char *rbname, uint64_t offset, void *host) "%s: offset: %" PRIx64 " host: %p"
 ram_save_queue_pages(const char *rbname, size_t start, size_t len) "%s: start: %zx len: %zx"
 
 # migration/migration.c
-- 
2.12.2
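
Illustrative note on patch 1/2: the effect of the new always_write
argument can be sketched outside QEMU as below. This is a minimal
standalone sketch, not QEMU code. range_is_zero() is a hypothetical
stand-in for is_zero_range(), and handle_compressed_sketch() and its
boolean return value exist only for this illustration; the condition
itself is copied from the patched ram_handle_compressed().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for QEMU's is_zero_range(): true if every byte is 0. */
static bool range_is_zero(const uint8_t *p, size_t size)
{
    for (size_t i = 0; i < size; i++) {
        if (p[i] != 0) {
            return false;
        }
    }
    return true;
}

/*
 * Same condition as the patched ram_handle_compressed().
 * Returns true when the memset, and therefore a write to the page,
 * actually happened. The return value is an addition for this sketch;
 * the real function returns void.
 */
static bool handle_compressed_sketch(void *host, uint8_t ch, size_t size,
                                     bool always_write)
{
    assert(host != NULL);
    if (ch != 0 || always_write || !range_is_zero(host, size)) {
        memset(host, ch, size);
        return true;
    }
    return false;
}
```

With always_write false, an incoming zero page landing on memory that
already reads as zero is skipped, so nothing is written to it. With
always_write true (as ram_load() now passes when postcopy_advised is
set) the write always happens, which is what the commit message relies
on to force allocation.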