From nobody Sun Feb 8 16:34:07 2026
Subject: [PATCH 1/2] migration/calc-dirty-rate: new metrics in sampling mode
Date: Tue, 28 Feb 2023 16:16:02 +0300
X-Mailer: git-send-email 2.30.2
MIME-Version: 1.0
X-BeenThere: qemu-devel@nongnu.org
Reply-to: Andrei Gudkov
From: Andrei Gudkov via

* Collect number of all-zero pages
* Collect vector of number of dirty pages for different time periods
* Report total number of pages, number of sampled pages and page size
* Replaced CRC32 with xxHash for performance reasons

Signed-off-by: Andrei Gudkov
---
 migration/dirtyrate.c | 219 +++++++++++++++++++++++++++++++++---------
 migration/dirtyrate.h |  26 ++++-
 qapi/migration.json   |  25 +++++
 3 files changed, 218 insertions(+), 52 deletions(-)

diff --git a/migration/dirtyrate.c b/migration/dirtyrate.c
index 575d48c397..cb5dc579c7 100644
--- a/migration/dirtyrate.c
+++ b/migration/dirtyrate.c
@@ -28,6 +28,7 @@
 #include "sysemu/kvm.h"
 #include "sysemu/runstate.h"
 #include "exec/memory.h"
+#include "qemu/xxhash.h"
 
 /*
  * total_dirty_pages is procted by BQL and is used
@@ -222,6 +223,7 @@ static struct DirtyRateInfo *query_dirty_rate_info(void)
     info->calc_time = DirtyStat.calc_time;
     info->sample_pages = DirtyStat.sample_pages;
     info->mode = dirtyrate_mode;
+    info->page_size = TARGET_PAGE_SIZE;
 
     if (qatomic_read(&CalculatingState) == DIRTY_RATE_STATUS_MEASURED) {
         info->has_dirty_rate = true;
@@ -243,6 +245,32 @@ static struct DirtyRateInfo *query_dirty_rate_info(void)
             info->vcpu_dirty_rate = head;
         }
 
+        if (dirtyrate_mode == DIRTY_RATE_MEASURE_MODE_PAGE_SAMPLING) {
+            int64List *periods_head = NULL;
+            int64List **periods_tail = &periods_head;
+            int64List *n_dirty_pages_head = NULL;
+            int64List **n_dirty_pages_tail = &n_dirty_pages_head;
+
+            info->n_total_pages = DirtyStat.page_sampling.n_total_pages;
+            info->has_n_total_pages = true;
+
+            info->n_sampled_pages = DirtyStat.page_sampling.n_sampled_pages;
+            info->has_n_sampled_pages = true;
+
+            info->n_zero_pages = DirtyStat.page_sampling.n_zero_pages;
+            info->has_n_zero_pages = true;
+
+            for (i = 0; i < DirtyStat.page_sampling.n_readings; i++) {
+                DirtyReading *dr = &DirtyStat.page_sampling.readings[i];
+                QAPI_LIST_APPEND(periods_tail, dr->period);
+                QAPI_LIST_APPEND(n_dirty_pages_tail, dr->n_dirty_pages);
+            }
+            info->n_dirty_pages = n_dirty_pages_head;
+            info->periods = periods_head;
+            info->has_n_dirty_pages = true;
+            info->has_periods = true;
+        }
+
         if (dirtyrate_mode == DIRTY_RATE_MEASURE_MODE_DIRTY_BITMAP) {
             info->sample_pages = 0;
         }
@@ -263,9 +291,12 @@ static void init_dirtyrate_stat(int64_t start_time,
 
     switch (config.mode) {
     case DIRTY_RATE_MEASURE_MODE_PAGE_SAMPLING:
-        DirtyStat.page_sampling.total_dirty_samples = 0;
-        DirtyStat.page_sampling.total_sample_count = 0;
-        DirtyStat.page_sampling.total_block_mem_MB = 0;
+        DirtyStat.page_sampling.n_total_pages = 0;
+        DirtyStat.page_sampling.n_sampled_pages = 0;
+        DirtyStat.page_sampling.n_zero_pages = 0;
+        DirtyStat.page_sampling.n_readings = 0;
+        DirtyStat.page_sampling.readings = g_try_malloc0_n(MAX_DIRTY_READINGS,
+                                                           sizeof(DirtyReading));
         break;
     case DIRTY_RATE_MEASURE_MODE_DIRTY_RING:
         DirtyStat.dirty_ring.nvcpu = -1;
@@ -283,28 +314,58 @@ static void cleanup_dirtyrate_stat(struct DirtyRateConfig config)
         free(DirtyStat.dirty_ring.rates);
         DirtyStat.dirty_ring.rates = NULL;
     }
+    if (DirtyStat.page_sampling.readings) {
+        free(DirtyStat.page_sampling.readings);
+        DirtyStat.page_sampling.readings = NULL;
+    }
 }
 
-static void update_dirtyrate_stat(struct RamblockDirtyInfo *info)
-{
-    DirtyStat.page_sampling.total_dirty_samples += info->sample_dirty_count;
-    DirtyStat.page_sampling.total_sample_count += info->sample_pages_count;
-    /* size of total pages in MB */
-    DirtyStat.page_sampling.total_block_mem_MB += (info->ramblock_pages *
-                                                   TARGET_PAGE_SIZE) >> 20;
+/*
+ * Compute hash of a single page of size TARGET_PAGE_SIZE.
+ * If ptr is NULL, then compute hash of a page entirely filled with zeros.
+ */
+static uint32_t compute_page_hash(void *ptr)
+{
+    uint32_t i;
+    uint64_t v1, v2, v3, v4;
+    uint64_t res;
+    const uint64_t *p = ptr;
+
+    v1 = QEMU_XXHASH_SEED + XXH_PRIME64_1 + XXH_PRIME64_2;
+    v2 = QEMU_XXHASH_SEED + XXH_PRIME64_2;
+    v3 = QEMU_XXHASH_SEED + 0;
+    v4 = QEMU_XXHASH_SEED - XXH_PRIME64_1;
+    if (ptr) {
+        for (i = 0; i < TARGET_PAGE_SIZE / 8; i += 4) {
+            v1 = XXH64_round(v1, p[i + 0]);
+            v2 = XXH64_round(v2, p[i + 1]);
+            v3 = XXH64_round(v3, p[i + 2]);
+            v4 = XXH64_round(v4, p[i + 3]);
+        }
+    } else {
+        for (i = 0; i < TARGET_PAGE_SIZE / 8; i += 4) {
+            v1 = XXH64_round(v1, 0);
+            v2 = XXH64_round(v2, 0);
+            v3 = XXH64_round(v3, 0);
+            v4 = XXH64_round(v4, 0);
+        }
+    }
+    res = XXH64_mergerounds(v1, v2, v3, v4);
+    res += TARGET_PAGE_SIZE;
+    res = XXH64_avalanche(res);
+    return (uint32_t)(res & UINT32_MAX);
 }
 
-static void update_dirtyrate(uint64_t msec)
+static uint32_t get_zero_page_hash(void)
 {
-    uint64_t dirtyrate;
-    uint64_t total_dirty_samples = DirtyStat.page_sampling.total_dirty_samples;
-    uint64_t total_sample_count = DirtyStat.page_sampling.total_sample_count;
-    uint64_t total_block_mem_MB = DirtyStat.page_sampling.total_block_mem_MB;
+    static uint32_t hash;
+    static int is_computed;
 
-    dirtyrate = total_dirty_samples * total_block_mem_MB *
-                1000 / (total_sample_count * msec);
-
-    DirtyStat.dirty_rate = dirtyrate;
+    if (!is_computed) {
+        hash = compute_page_hash(NULL);
+        is_computed = 1;
+    }
+    return hash;
 }
 
 /*
@@ -314,13 +375,10 @@ static void update_dirtyrate(uint64_t msec)
 static uint32_t get_ramblock_vfn_hash(struct RamblockDirtyInfo *info,
                                       uint64_t vfn)
 {
-    uint32_t crc;
-
-    crc = crc32(0, (info->ramblock_addr +
-                    vfn * TARGET_PAGE_SIZE), TARGET_PAGE_SIZE);
-
-    trace_get_ramblock_vfn_hash(info->idstr, vfn, crc);
-    return crc;
+    uint32_t hash;
+    hash = compute_page_hash(info->ramblock_addr + vfn * TARGET_PAGE_SIZE);
+    trace_get_ramblock_vfn_hash(info->idstr, vfn, hash);
+    return hash;
 }
 
 static bool save_ramblock_hash(struct RamblockDirtyInfo *info)
@@ -328,6 +386,7 @@ static bool save_ramblock_hash(struct RamblockDirtyInfo *info)
     unsigned int sample_pages_count;
     int i;
     GRand *rand;
+    uint32_t zero_page_hash = get_zero_page_hash();
 
     sample_pages_count = info->sample_pages_count;
 
@@ -349,12 +408,17 @@ static bool save_ramblock_hash(struct RamblockDirtyInfo *info)
         return false;
     }
 
-    rand = g_rand_new();
+    rand = g_rand_new();
+    DirtyStat.page_sampling.n_total_pages += info->ramblock_pages;
     for (i = 0; i < sample_pages_count; i++) {
         info->sample_page_vfn[i] = g_rand_int_range(rand, 0,
                                                     info->ramblock_pages - 1);
         info->hash_result[i] = get_ramblock_vfn_hash(info,
                                                      info->sample_page_vfn[i]);
+        DirtyStat.page_sampling.n_sampled_pages++;
+        if (info->hash_result[i] == zero_page_hash) {
+            DirtyStat.page_sampling.n_zero_pages++;
+        }
     }
     g_rand_free(rand);
 
@@ -451,18 +515,20 @@ out:
     return ret;
 }
 
-static void calc_page_dirty_rate(struct RamblockDirtyInfo *info)
+static int64_t calc_page_dirty_rate(struct RamblockDirtyInfo *info)
 {
     uint32_t crc;
     int i;
 
+    int64_t n_dirty = 0;
     for (i = 0; i < info->sample_pages_count; i++) {
         crc = get_ramblock_vfn_hash(info, info->sample_page_vfn[i]);
         if (crc != info->hash_result[i]) {
+            n_dirty++;
             trace_calc_page_dirty_rate(info->idstr, crc, info->hash_result[i]);
-            info->sample_dirty_count++;
         }
     }
+    return n_dirty;
 }
 
 static struct RamblockDirtyInfo *
@@ -491,11 +557,12 @@ find_block_matched(RAMBlock *block, int count,
     return &infos[i];
 }
 
-static bool compare_page_hash_info(struct RamblockDirtyInfo *info,
+static int64_t compare_page_hash_info(struct RamblockDirtyInfo *info,
                                   int block_count)
 {
     struct RamblockDirtyInfo *block_dinfo = NULL;
     RAMBlock *block = NULL;
+    int64_t n_dirty = 0;
 
     RAMBLOCK_FOREACH_MIGRATABLE(block) {
         if (skip_sample_ramblock(block)) {
@@ -505,15 +572,10 @@ static bool compare_page_hash_info(struct RamblockDirtyInfo *info,
         if (block_dinfo == NULL) {
            continue;
         }
-        calc_page_dirty_rate(block_dinfo);
-        update_dirtyrate_stat(block_dinfo);
-    }
-
-    if (DirtyStat.page_sampling.total_sample_count == 0) {
-        return false;
+        n_dirty += calc_page_dirty_rate(block_dinfo);
     }
 
-    return true;
+    return n_dirty;
 }
 
 static inline void record_dirtypages_bitmap(DirtyPageRecord *dirty_pages,
@@ -544,6 +606,8 @@ static void calculate_dirtyrate_dirty_bitmap(struct DirtyRateConfig config)
     int64_t start_time;
     DirtyPageRecord dirty_pages;
 
+
+    qemu_mutex_lock_iothread();
     memory_global_dirty_log_start(GLOBAL_DIRTY_DIRTY_RATE);
 
@@ -614,13 +678,40 @@ static void calculate_dirtyrate_dirty_ring(struct DirtyRateConfig config)
     DirtyStat.dirty_rate = dirtyrate_sum;
 }
 
+static int64_t increase_period(int64_t prev_period, int64_t max_period)
+{
+    int64_t delta;
+    int64_t next_period;
+
+    if (prev_period < 500) {
+        delta = 125;
+    } else if (prev_period < 1000) {
+        delta = 250;
+    } else if (prev_period < 2000) {
+        delta = 500;
+    } else if (prev_period < 4000) {
+        delta = 1000;
+    } else if (prev_period < 10000) {
+        delta = 2000;
+    } else {
+        delta = 5000;
+    }
+
+    next_period = prev_period + delta;
+    if (next_period + delta >= max_period) {
+        next_period = max_period;
+    }
+    return next_period;
+}
+
 static void calculate_dirtyrate_sample_vm(struct DirtyRateConfig config)
 {
     struct RamblockDirtyInfo *block_dinfo = NULL;
     int block_count = 0;
-    int64_t msec = 0;
     int64_t initial_time;
+    int64_t current_time;
 
+    /* first pass */
     rcu_read_lock();
     initial_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
     if (!record_ramblock_hash_info(&block_dinfo, config, &block_count)) {
@@ -628,20 +719,34 @@ static void calculate_dirtyrate_sample_vm(struct DirtyRateConfig config)
     }
     rcu_read_unlock();
 
-    msec = config.sample_period_seconds * 1000;
-    msec = dirty_stat_wait(msec, initial_time);
-    DirtyStat.start_time = initial_time / 1000;
-    DirtyStat.calc_time = msec / 1000;
+    int64_t period = INITIAL_PERIOD_MS;
+    while (true) {
+        current_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+        int64_t delta = initial_time + period - current_time;
+        if (delta > 0) {
+            g_usleep(delta * 1000);
+        }
 
-    rcu_read_lock();
-    if (!compare_page_hash_info(block_dinfo, block_count)) {
-        goto out;
-    }
+        rcu_read_lock();
+        current_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+        int64_t n_dirty = compare_page_hash_info(block_dinfo, block_count);
+        rcu_read_unlock();
 
-    update_dirtyrate(msec);
+        SampleVMStat *ps = &DirtyStat.page_sampling;
+        ps->readings[ps->n_readings].period = current_time - initial_time;
+        ps->readings[ps->n_readings].n_dirty_pages = n_dirty;
+        ps->n_readings++;
+
+        if (period >= DirtyStat.calc_time * 1000) {
+            int64_t mb_total = (ps->n_total_pages * TARGET_PAGE_SIZE) >> 20;
+            int64_t mb_dirty = n_dirty * mb_total / ps->n_sampled_pages;
+            DirtyStat.dirty_rate = mb_dirty * 1000 / period;
+            break;
+        }
+        period = increase_period(period, DirtyStat.calc_time * 1000);
+    }
 
 out:
-    rcu_read_unlock();
     free_ramblock_dirty_info(block_dinfo, block_count);
 }
 
@@ -804,11 +909,29 @@ void hmp_info_dirty_rate(Monitor *mon, const QDict *qdict)
                            rate->value->dirty_rate);
             }
         }
+
     } else {
         monitor_printf(mon, "(not ready)\n");
     }
 
+    if (info->has_n_total_pages) {
+        monitor_printf(mon, "Page count (page size %d):\n", TARGET_PAGE_SIZE);
+        monitor_printf(mon, " Total: %"PRIi64"\n", info->n_total_pages);
+        monitor_printf(mon, " Sampled: %"PRIi64"\n", info->n_sampled_pages);
+        monitor_printf(mon, " Zero: %"PRIi64"\n", info->n_zero_pages);
+        int64List *periods = info->periods;
+        int64List *n_dirty_pages = info->n_dirty_pages;
+        while (periods) {
+            monitor_printf(mon, " Dirty(%"PRIi64"ms): %"PRIi64"\n",
+                           periods->value, n_dirty_pages->value);
+            periods = periods->next;
+            n_dirty_pages = n_dirty_pages->next;
+        }
+    }
+
     qapi_free_DirtyRateVcpuList(info->vcpu_dirty_rate);
+    qapi_free_int64List(info->periods);
+    qapi_free_int64List(info->n_dirty_pages);
     g_free(info);
 }
 
diff --git a/migration/dirtyrate.h b/migration/dirtyrate.h
index 594a5c0bb6..e2af72fb8c 100644
--- a/migration/dirtyrate.h
+++ b/migration/dirtyrate.h
@@ -42,6 +42,18 @@
 #define MIN_SAMPLE_PAGE_COUNT 128
 #define MAX_SAMPLE_PAGE_COUNT 16384
 
+/*
+ * Initial sampling period expressed in milliseconds
+ */
+#define INITIAL_PERIOD_MS 125
+
+/*
+ * Upper bound on the number of DirtyReadings calculated based on
+ * INITIAL_PERIOD_MS, MAX_FETCH_DIRTYRATE_TIME_SEC and increase_period()
+ */
+#define MAX_DIRTY_READINGS 32
+
+
 struct DirtyRateConfig {
     uint64_t sample_pages_per_gigabytes; /* sample pages per GB */
     int64_t sample_period_seconds; /* time duration between two sampling */
@@ -57,14 +69,20 @@ struct RamblockDirtyInfo {
     uint64_t ramblock_pages; /* ramblock size in TARGET_PAGE_SIZE */
     uint64_t *sample_page_vfn; /* relative offset address for sampled page */
     uint64_t sample_pages_count; /* count of sampled pages */
-    uint64_t sample_dirty_count; /* count of dirty pages we measure */
     uint32_t *hash_result; /* array of hash result for sampled pages */
 };
 
+typedef struct DirtyReading {
+    int64_t period; /* time period in milliseconds */
+    int64_t n_dirty_pages; /* number of observed dirty pages */
+} DirtyReading;
+
 typedef struct SampleVMStat {
-    uint64_t total_dirty_samples; /* total dirty sampled page */
-    uint64_t total_sample_count; /* total sampled pages */
-    uint64_t total_block_mem_MB; /* size of total sampled pages in MB */
+    int64_t n_total_pages; /* total number of pages */
+    int64_t n_sampled_pages; /* number of sampled pages */
+    int64_t n_zero_pages; /* number of observed zero pages */
+    int64_t n_readings;
+    DirtyReading *readings;
 } SampleVMStat;
 
 /*
diff --git a/qapi/migration.json b/qapi/migration.json
index c84fa10e86..1a1d7cb30a 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -1830,6 +1830,25 @@
 # @mode: mode containing method of calculate dirtyrate includes
 #        'page-sampling' and 'dirty-ring' (Since 6.2)
 #
+# @page-size: page size in bytes
+#
+# @n-total-pages: total number of VM pages
+#
+# @n-sampled-pages: number of sampled pages
+#
+# @n-zero-pages: number of observed zero pages among all sampled pages.
+#                Normally all pages are zero when VM starts, but
+#                their number progressively goes down as VM fills more
+#                and more memory with useful data.
+#                Migration of zero pages is optimized: only their headers
+#                are copied but not the (zero) data.
+#
+# @periods: array of time periods expressed in milliseconds for which
+#           dirty-sample measurements are collected
+#
+# @n-dirty-pages: number of pages among all sampled pages that were observed
+#                 as changed after the respective time period
+#
 # @vcpu-dirty-rate: dirtyrate for each vcpu if dirty-ring
 #                   mode specified (Since 6.2)
 #
@@ -1842,6 +1861,12 @@
             'calc-time': 'int64',
             'sample-pages': 'uint64',
             'mode': 'DirtyRateMeasureMode',
+            'page-size': 'int64',
+            '*n-total-pages': 'int64',
+            '*n-sampled-pages': 'int64',
+            '*n-zero-pages': 'int64',
+            '*periods': ['int64'],
+            '*n-dirty-pages': ['int64'],
             '*vcpu-dirty-rate': [ 'DirtyRateVcpu' ] } }
 
 ##
-- 
2.30.2
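
Illustration (not part of the patch): a hypothetical QMP exchange showing where
the new DirtyRateInfo fields surface. The calc-dirty-rate and query-dirty-rate
commands and their existing arguments are unchanged by this series; every
numeric value below is made up, and the "periods" list simply follows the
schedule increase_period() produces from INITIAL_PERIOD_MS=125 for a 10-second
calc-time (in practice each recorded period is the actually elapsed time, so
the numbers would be slightly larger).

  { "execute": "calc-dirty-rate",
    "arguments": { "calc-time": 10, "mode": "page-sampling" } }
  { "return": {} }

  (wait at least calc-time seconds)

  { "execute": "query-dirty-rate" }
  { "return": { "status": "measured",
                "mode": "page-sampling",
                "start-time": 1677589000,
                "calc-time": 10,
                "sample-pages": 512,
                "page-size": 4096,
                "dirty-rate": 171,
                "n-total-pages": 1048576,
                "n-sampled-pages": 2048,
                "n-zero-pages": 1306,
                "periods": [125, 250, 375, 500, 750, 1000,
                            1500, 2000, 3000, 4000, 6000, 10000],
                "n-dirty-pages": [33, 59, 83, 104, 143, 180,
                                  247, 302, 404, 495, 649, 855] } }

The per-period pairs are what the new readings array is meant to expose: the
same set of sampled pages is rehashed at each period, so n-dirty-pages grows
monotonically and gives a picture of how quickly the guest touches new memory,
rather than a single averaged rate.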