From: Wang, Wei W <wei.w.wang@intel.com>
To: David Hildenbrand, Peter Xu
Subject: RE: [PATCH] migration: Move bitmap_mutex out of migration_bitmap_clear_dirty()
Date: Sat, 3 Jul 2021 02:53:27 +0000
Message-ID: <562b42cbd5674853af21be3297fbaada@intel.com>
In-Reply-To: <824a1d77-eab0-239f-5104-49c49d6ad285@redhat.com>
References: <20210630200805.280905-1-peterx@redhat.com>
 <33f137dae5c346078a3a7a658bb5f1ab@intel.com>
 <304fc749-03a0-b58d-05cc-f0d78350e015@redhat.com>
 <604935aa45114d889800f6ccc23c6b13@intel.com>
 <824a1d77-eab0-239f-5104-49c49d6ad285@redhat.com>
Cc: Hailiang Zhang, Juan Quintela, qemu-devel@nongnu.org,
 Dr. David Alan Gilbert, Leonardo Bras Soares Passos

On Friday, July 2, 2021 3:07 PM, David Hildenbrand wrote:
> On 02.07.21 04:48, Wang, Wei W wrote:
> > On Thursday, July 1, 2021 10:22 PM, David Hildenbrand wrote:
> >> On 01.07.21 14:51, Peter Xu wrote:
>
> I think that clearly shows the issue.
>
> My theory, which I did not verify yet: Assume we have 1GB chunks in the
> clear bmap. Assume the VM reports all pages within a 1GB chunk as free
> (easy with a fresh VM). While processing hints, we will clear the bits
> from the dirty bmap in the RAMBlock. As we will never migrate any page
> of that 1GB chunk, we will not actually clear the dirty bitmap of the
> memory region. When re-syncing, we will set all bits in the dirty bmap
> again from the dirty bitmap in the memory region. Thus, many of our
> hints end up being mostly ignored. The smaller the clear bmap chunk,
> the more extreme the issue.

OK, that looks possible. We need to clear the related bits from the memory
region bitmap before skipping the free pages.
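Just to make the chunk arithmetic concrete, here is a small standalone
sketch (TARGET_PAGE_BITS, the shift value, and the page index below are
illustrative assumptions, not necessarily the values QEMU uses on a given
target): each clear_bmap bit covers 1ULL << (TARGET_PAGE_BITS + shift)
bytes, and a page's chunk start is its byte address aligned down to that
size, which is the same computation the helper in the patch below does.

/*
 * Standalone sketch of the clear_bmap chunk arithmetic.
 * The constants are assumptions for illustration only.
 */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define TARGET_PAGE_BITS 12   /* assumed 4 KiB pages */
#define CLEAR_BMAP_SHIFT 18   /* assumed: 4 KiB << 18 = 1 GiB per chunk */

int main(void)
{
    uint64_t page = 0x40123;  /* some page index within a RAMBlock (made up) */
    uint64_t size = 1ULL << (TARGET_PAGE_BITS + CLEAR_BMAP_SHIFT);
    uint64_t start = (page << TARGET_PAGE_BITS) & -size;   /* align down */

    /* One clear_bmap bit covers [start, start + size) of the RAMBlock. */
    printf("page %#llx -> chunk [%#llx, %#llx)\n",
           (unsigned long long)page,
           (unsigned long long)start,
           (unsigned long long)(start + size));
    assert(start % size == 0);
    return 0;
}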
Could you try with the patch below:

diff --git a/migration/ram.c b/migration/ram.c
index ace8ad431c..a1f6df3e6c 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -811,6 +811,26 @@ unsigned long migration_bitmap_find_dirty(RAMState *rs, RAMBlock *rb,
     return next;
 }
 
+
+static void migration_clear_memory_region_dirty_bitmap(RAMState *rs,
+                                                       RAMBlock *rb,
+                                                       unsigned long page)
+{
+    uint8_t shift;
+    hwaddr size, start;
+
+    if (!rb->clear_bmap || !clear_bmap_test_and_clear(rb, page))
+        return;
+
+    shift = rb->clear_bmap_shift;
+    assert(shift >= 6);
+
+    size = 1ULL << (TARGET_PAGE_BITS + shift);
+    start = (((ram_addr_t)page) << TARGET_PAGE_BITS) & (-size);
+    trace_migration_bitmap_clear_dirty(rb->idstr, start, size, page);
+    memory_region_clear_dirty_bitmap(rb->mr, start, size);
+}
+
 static inline bool migration_bitmap_clear_dirty(RAMState *rs,
                                                 RAMBlock *rb,
                                                 unsigned long page)
@@ -827,26 +847,9 @@ static inline bool migration_bitmap_clear_dirty(RAMState *rs,
      * the page in the chunk we clear the remote dirty bitmap for all.
      * Clearing it earlier won't be a problem, but too late will.
      */
-    if (rb->clear_bmap && clear_bmap_test_and_clear(rb, page)) {
-        uint8_t shift = rb->clear_bmap_shift;
-        hwaddr size = 1ULL << (TARGET_PAGE_BITS + shift);
-        hwaddr start = (((ram_addr_t)page) << TARGET_PAGE_BITS) & (-size);
-
-        /*
-         * CLEAR_BITMAP_SHIFT_MIN should always guarantee this... this
-         * can make things easier sometimes since then start address
-         * of the small chunk will always be 64 pages aligned so the
-         * bitmap will always be aligned to unsigned long. We should
-         * even be able to remove this restriction but I'm simply
-         * keeping it.
-         */
-        assert(shift >= 6);
-        trace_migration_bitmap_clear_dirty(rb->idstr, start, size, page);
-        memory_region_clear_dirty_bitmap(rb->mr, start, size);
-    }
+    migration_clear_memory_region_dirty_bitmap(rs, rb, page);
 
     ret = test_and_clear_bit(page, rb->bmap);
-
     if (ret) {
         rs->migration_dirty_pages--;
     }
@@ -2746,7 +2749,7 @@ void qemu_guest_free_page_hint(void *addr, size_t len)
 {
     RAMBlock *block;
     ram_addr_t offset;
-    size_t used_len, start, npages;
+    size_t used_len, start, npages, page_to_clear, i = 0;
     MigrationState *s = migrate_get_current();
 
     /* This function is currently expected to be used during live migration */
@@ -2775,6 +2778,19 @@ void qemu_guest_free_page_hint(void *addr, size_t len)
         start = offset >> TARGET_PAGE_BITS;
         npages = used_len >> TARGET_PAGE_BITS;
 
+        /*
+         * The skipped free pages are equivalent to being sent from clear_bmap's
+         * perspective, so clear the bits from the memory region bitmap which
+         * are initially set. Otherwise those skipped pages will be sent in
+         * the next round after syncing from the memory region bitmap.
+         */
+        do {
+            page_to_clear = start + (i++ << block->clear_bmap_shift);
+            migration_clear_memory_region_dirty_bitmap(ram_state,
+                                                       block,
+                                                       page_to_clear);
+        } while (i <= npages >> block->clear_bmap_shift);
+
         qemu_mutex_lock(&ram_state->bitmap_mutex);
         ram_state->migration_dirty_pages -=
                       bitmap_count_one_with_offset(block->bmap, start, npages);

Best,
Wei
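PS: to see which chunk-sized strides the new do/while takes over a hinted
range, here is a tiny standalone sketch (the shift and the hinted range are
made-up illustrative values, not taken from a real guest):

#include <stdint.h>
#include <stdio.h>

#define CLEAR_BMAP_SHIFT 18   /* assumed: one clear_bmap bit covers 1 << 18 pages */

int main(void)
{
    uint64_t start  = 0x40123;   /* first page of the hinted free range (made up) */
    uint64_t npages = 0x80000;   /* number of hinted pages (made up) */
    uint64_t i = 0, page_to_clear;

    /*
     * Walk the hinted range in chunk-sized strides, touching one page per
     * stride, so the clear_bmap bit for each touched chunk gets tested
     * and, when set, the memory region bitmap cleared for that chunk.
     */
    do {
        page_to_clear = start + (i++ << CLEAR_BMAP_SHIFT);
        printf("would clear the chunk containing page %#llx\n",
               (unsigned long long)page_to_clear);
    } while (i <= npages >> CLEAR_BMAP_SHIFT);

    return 0;
}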