From: Thomas Huth
To: Juan Quintela, "Dr. David Alan Gilbert", peterx@redhat.com, qemu-devel@nongnu.org
Subject: [PATCH for-7.1] Revert "migration: Simplify unqueue_page()"
Date: Tue, 2 Aug 2022 08:19:49 +0200
Message-Id: <20220802061949.331576-1-thuth@redhat.com>

This reverts commit cfd66f30fb0f735df06ff4220e5000290a43dad3.

The simplification of unqueue_page() introduced a bug that sometimes
breaks migration on s390x hosts. Seems like there are still pages here
that do not have their dirty bit set.

The problem is not fully understood yet, but since we are already in
the freeze for QEMU 7.1 and we need something working there, let's
revert this patch for the upcoming release. The optimization can be
redone later again in a proper way if necessary.

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2099934
Signed-off-by: Thomas Huth
Reviewed-by: Dr. David Alan Gilbert
---
 migration/ram.c        | 37 ++++++++++++++++++++++++++-----------
 migration/trace-events |  3 ++-
 2 files changed, 28 insertions(+), 12 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index b94669ba5d..dc1de9ddbc 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1612,7 +1612,6 @@ static RAMBlock *unqueue_page(RAMState *rs, ram_addr_t *offset)
 {
     struct RAMSrcPageRequest *entry;
     RAMBlock *block = NULL;
-    size_t page_size;
 
     if (!postcopy_has_request(rs)) {
         return NULL;
@@ -1629,13 +1628,10 @@ static RAMBlock *unqueue_page(RAMState *rs, ram_addr_t *offset)
     entry = QSIMPLEQ_FIRST(&rs->src_page_requests);
     block = entry->rb;
     *offset = entry->offset;
-    page_size = qemu_ram_pagesize(block);
-    /* Each page request should only be multiple page size of the ramblock */
-    assert((entry->len % page_size) == 0);
 
-    if (entry->len > page_size) {
-        entry->len -= page_size;
-        entry->offset += page_size;
+    if (entry->len > TARGET_PAGE_SIZE) {
+        entry->len -= TARGET_PAGE_SIZE;
+        entry->offset += TARGET_PAGE_SIZE;
     } else {
         memory_region_unref(block->mr);
         QSIMPLEQ_REMOVE_HEAD(&rs->src_page_requests, next_req);
@@ -1643,9 +1639,6 @@ static RAMBlock *unqueue_page(RAMState *rs, ram_addr_t *offset)
         migration_consume_urgent_request();
     }
 
-    trace_unqueue_page(block->idstr, *offset,
-                       test_bit((*offset >> TARGET_PAGE_BITS), block->bmap));
-
     return block;
 }
 
@@ -2069,8 +2062,30 @@ static bool get_queued_page(RAMState *rs, PageSearchStatus *pss)
 {
     RAMBlock *block;
     ram_addr_t offset;
+    bool dirty;
+
+    do {
+        block = unqueue_page(rs, &offset);
+        /*
+         * We're sending this page, and since it's postcopy nothing else
+         * will dirty it, and we must make sure it doesn't get sent again
+         * even if this queue request was received after the background
+         * search already sent it.
+         */
+        if (block) {
+            unsigned long page;
+
+            page = offset >> TARGET_PAGE_BITS;
+            dirty = test_bit(page, block->bmap);
+            if (!dirty) {
+                trace_get_queued_page_not_dirty(block->idstr, (uint64_t)offset,
+                                                page);
+            } else {
+                trace_get_queued_page(block->idstr, (uint64_t)offset, page);
+            }
+        }
 
-    block = unqueue_page(rs, &offset);
+    } while (block && !dirty);
 
     if (block) {
         /* See comment above postcopy_preempted_contains() */
diff --git a/migration/trace-events b/migration/trace-events
index a34afe7b85..57003edcbd 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -85,6 +85,8 @@ put_qlist_end(const char *field_name, const char *vmsd_name) "%s(%s)"
 qemu_file_fclose(void) ""
 
 # ram.c
+get_queued_page(const char *block_name, uint64_t tmp_offset, unsigned long page_abs) "%s/0x%" PRIx64 " page_abs=0x%lx"
+get_queued_page_not_dirty(const char *block_name, uint64_t tmp_offset, unsigned long page_abs) "%s/0x%" PRIx64 " page_abs=0x%lx"
 migration_bitmap_sync_start(void) ""
 migration_bitmap_sync_end(uint64_t dirty_pages) "dirty_pages %" PRIu64
 migration_bitmap_clear_dirty(char *str, uint64_t start, uint64_t size, unsigned long page) "rb %s start 0x%"PRIx64" size 0x%"PRIx64" page 0x%lx"
@@ -110,7 +112,6 @@ ram_save_iterate_big_wait(uint64_t milliconds, int iterations) "big wait: %" PRI
 ram_load_complete(int ret, uint64_t seq_iter) "exit_code %d seq iteration %" PRIu64
 ram_write_tracking_ramblock_start(const char *block_id, size_t page_size, void *addr, size_t length) "%s: page_size: %zu addr: %p length: %zu"
 ram_write_tracking_ramblock_stop(const char *block_id, size_t page_size, void *addr, size_t length) "%s: page_size: %zu addr: %p length: %zu"
-unqueue_page(char *block, uint64_t offset, bool dirty) "ramblock '%s' offset 0x%"PRIx64" dirty %d"
 postcopy_preempt_triggered(char *str, unsigned long page) "during sending ramblock %s offset 0x%lx"
 postcopy_preempt_restored(char *str, unsigned long page) "ramblock %s offset 0x%lx"
 postcopy_preempt_hit(char *str, uint64_t offset) "ramblock %s offset 0x%"PRIx64
-- 
2.31.1
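
Note for readers of the get_queued_page() hunk: the following is a minimal
standalone sketch of the behavior this revert restores, not the QEMU code
itself. Requests are consumed one target page at a time, and a page whose
dirty bit is already clear is skipped instead of being handed back to the
caller. All sketch_ names, the array-backed queue, and the fixed 4 KiB page
size are assumptions made for illustration only.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed page geometry for the sketch (QEMU uses TARGET_PAGE_BITS/SIZE). */
#define SKETCH_PAGE_BITS 12
#define SKETCH_PAGE_SIZE ((uint64_t)1 << SKETCH_PAGE_BITS)

/* Hypothetical stand-in for one queued postcopy page request. */
struct sketch_request {
    uint64_t offset;   /* start offset within the (single) RAM block */
    uint64_t len;      /* remaining length in bytes */
};

/* Hypothetical stand-in for the parts of RAMState/RAMBlock used here. */
struct sketch_state {
    struct sketch_request *queue;  /* pending requests, oldest first */
    size_t head;                   /* index of the first pending request */
    size_t count;                  /* number of entries in queue[] */
    const bool *dirty;             /* one flag per page, like block->bmap */
};

/*
 * Take one page off the front of the queue. A request longer than one
 * page is shrunk in place; a request of one page (or less) is removed.
 * Returns false when the queue is empty.
 */
static bool sketch_unqueue_page(struct sketch_state *s, uint64_t *offset)
{
    struct sketch_request *entry;

    if (s->head == s->count) {
        return false;
    }

    entry = &s->queue[s->head];
    *offset = entry->offset;

    if (entry->len > SKETCH_PAGE_SIZE) {
        entry->len -= SKETCH_PAGE_SIZE;
        entry->offset += SKETCH_PAGE_SIZE;
    } else {
        s->head++;
    }
    return true;
}

/*
 * Keep dequeuing until a page that is still dirty turns up, or the
 * queue runs dry. Pages that are no longer dirty were already sent
 * (for example by the background scan) and must not be sent again.
 */
static bool sketch_get_queued_page(struct sketch_state *s, uint64_t *offset)
{
    bool found;
    bool dirty = false;

    do {
        found = sketch_unqueue_page(s, offset);
        if (found) {
            dirty = s->dirty[*offset >> SKETCH_PAGE_BITS];
        }
    } while (found && !dirty);

    return found;
}
```

With a three-page request starting at offset 0 and only pages 0 and 2 marked
dirty, successive calls to sketch_get_queued_page() yield offset 0 and then
offset 2 * SKETCH_PAGE_SIZE; the clean page in between is dropped silently.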