From: Andrei Gudkov
Subject: [PATCH] migration/calc-dirty-rate: millisecond precision period
Date: Thu, 29 Jun 2023 11:59:03 +0300
Message-ID: <8571da37847f9bb39b84e62ef4998e68ef3c10d1.1688028297.git.gudkov.andrei@huawei.com>

Introduce an alternative argument, calc-time-ms, which is the same as
calc-time but accepts a value in milliseconds. Millisecond precision
makes it possible to predict whether migration will succeed. To do
this, calculate the dirty rate with calc-time-ms set to the maximum
allowed downtime, convert the measured rate into the volume of memory
dirtied during that period, and divide by the network throughput. If
the resulting transfer time is lower than the maximum allowed downtime,
migration will converge.

Measurement results for a single thread randomly writing to a 24GiB
region:

+--------------+--------------------+
| calc-time-ms | dirty-rate (MiB/s) |
+--------------+--------------------+
|          100 |               1880 |
|          200 |               1340 |
|          300 |               1120 |
|          400 |               1030 |
|          500 |                868 |
|          750 |                720 |
|         1000 |                636 |
|         1500 |                498 |
|         2000 |                423 |
+--------------+--------------------+

Signed-off-by: Andrei Gudkov
Acked-by: Peter Xu
---
 qapi/migration.json   | 15 ++++++--
 migration/dirtyrate.h | 12 ++++---
 migration/dirtyrate.c | 81 +++++++++++++++++++++++++------------------
 3 files changed, 68 insertions(+), 40 deletions(-)

diff --git a/qapi/migration.json b/qapi/migration.json
index 5bb5ab82a0..dd1afe1982 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -1778,7 +1778,12 @@
 #
 # @start-time: start time in units of second for calculation
 #
-# @calc-time: time in units of second for sample dirty pages
+# @calc-time: time period for which dirty page rate was measured
+#     (rounded down to seconds).
+#
+# @calc-time-ms: actual time period for which dirty page rate was
+#     measured (in milliseconds).  Value may be larger than requested
+#     time period due to measurement overhead.
 #
 # @sample-pages: page count per GB for sample dirty pages the default
 #     value is 512 (since 6.1)
@@ -1796,6 +1801,7 @@
            'status': 'DirtyRateStatus',
            'start-time': 'int64',
            'calc-time': 'int64',
+           'calc-time-ms': 'int64',
            'sample-pages': 'uint64',
            'mode': 'DirtyRateMeasureMode',
            '*vcpu-dirty-rate': [ 'DirtyRateVcpu' ] } }
@@ -1807,6 +1813,10 @@
 #
 # @calc-time: time in units of second for sample dirty pages
 #
+# @calc-time-ms: the same as @calc-time but in milliseconds.  These
+#     two arguments are mutually exclusive.  Exactly one of them must
+#     be specified. (Since 8.1)
+#
 # @sample-pages: page count per GB for sample dirty pages the default
 #     value is 512 (since 6.1)
 #
@@ -1821,7 +1831,8 @@
 #                                                 'sample-pages': 512} }
 # <- { "return": {} }
 ##
-{ 'command': 'calc-dirty-rate', 'data': {'calc-time': 'int64',
+{ 'command': 'calc-dirty-rate', 'data': {'*calc-time': 'int64',
+                                         '*calc-time-ms': 'int64',
                                          '*sample-pages': 'int',
                                          '*mode': 'DirtyRateMeasureMode'} }
 
diff --git a/migration/dirtyrate.h b/migration/dirtyrate.h
index 594a5c0bb6..869c060941 100644
--- a/migration/dirtyrate.h
+++ b/migration/dirtyrate.h
@@ -31,10 +31,12 @@
 #define MIN_RAMBLOCK_SIZE                         128
 
 /*
- * Take 1s as minimum time for calculation duration
+ * Allowed range for dirty page rate calculation (in milliseconds).
+ * Lower limit relates to the smallest realistic downtime it
+ * makes sense to impose on migration.
  */
-#define MIN_FETCH_DIRTYRATE_TIME_SEC              1
-#define MAX_FETCH_DIRTYRATE_TIME_SEC              60
+#define MIN_CALC_TIME_MS                          50
+#define MAX_CALC_TIME_MS                       60000
 
 /*
  * Take 1/16 pages in 1G as the maxmum sample page count
@@ -44,7 +46,7 @@
 
 struct DirtyRateConfig {
     uint64_t sample_pages_per_gigabytes; /* sample pages per GB */
-    int64_t sample_period_seconds; /* time duration between two sampling */
+    int64_t calc_time_ms; /* desired calculation time (in milliseconds) */
     DirtyRateMeasureMode mode; /* mode of dirtyrate measurement */
 };
 
@@ -73,7 +75,7 @@ typedef struct SampleVMStat {
 struct DirtyRateStat {
     int64_t dirty_rate; /* dirty rate in MB/s */
     int64_t start_time; /* calculation start time in units of second */
-    int64_t calc_time; /* time duration of two sampling in units of second */
+    int64_t calc_time_ms; /* actual calculation time (in milliseconds) */
     uint64_t sample_pages; /* sample pages per GB */
     union {
         SampleVMStat page_sampling;
diff --git a/migration/dirtyrate.c b/migration/dirtyrate.c
index 84f1b0fb20..90fb336329 100644
--- a/migration/dirtyrate.c
+++ b/migration/dirtyrate.c
@@ -57,6 +57,8 @@ static int64_t dirty_stat_wait(int64_t msec, int64_t initial_time)
         msec = current_time - initial_time;
     } else {
         g_usleep((msec + initial_time - current_time) * 1000);
+        /* g_usleep may overshoot */
+        msec = qemu_clock_get_ms(QEMU_CLOCK_REALTIME) - initial_time;
     }
 
     return msec;
@@ -77,9 +79,12 @@ static int64_t do_calculate_dirtyrate(DirtyPageRecord dirty_pages,
 {
     uint64_t increased_dirty_pages =
         dirty_pages.end_pages - dirty_pages.start_pages;
-    uint64_t memory_size_MiB = qemu_target_pages_to_MiB(increased_dirty_pages);
-
-    return memory_size_MiB * 1000 / calc_time_ms;
+    /*
+     * multiply by 1000ms/s _before_ converting down to megabytes
+     * to avoid losing precision
+     */
+    return qemu_target_pages_to_MiB(increased_dirty_pages * 1000) /
+        calc_time_ms;
 }
 
 void global_dirty_log_change(unsigned int flag, bool start)
@@ -183,10 +188,9 @@ retry:
     return duration;
 }
 
-static bool is_sample_period_valid(int64_t sec)
+static bool is_calc_time_valid(int64_t msec)
 {
-    if (sec < MIN_FETCH_DIRTYRATE_TIME_SEC ||
-        sec > MAX_FETCH_DIRTYRATE_TIME_SEC) {
+    if ((msec < MIN_CALC_TIME_MS) || (msec > MAX_CALC_TIME_MS)) {
         return false;
     }
 
@@ -219,7 +223,8 @@ static struct DirtyRateInfo *query_dirty_rate_info(void)
 
     info->status = CalculatingState;
     info->start_time = DirtyStat.start_time;
-    info->calc_time = DirtyStat.calc_time;
+    info->calc_time_ms = DirtyStat.calc_time_ms;
+    info->calc_time = DirtyStat.calc_time_ms / 1000;
     info->sample_pages = DirtyStat.sample_pages;
     info->mode = dirtyrate_mode;
 
@@ -258,7 +263,7 @@ static void init_dirtyrate_stat(int64_t start_time,
 {
     DirtyStat.dirty_rate = -1;
     DirtyStat.start_time = start_time;
-    DirtyStat.calc_time = config.sample_period_seconds;
+    DirtyStat.calc_time_ms = config.calc_time_ms;
     DirtyStat.sample_pages = config.sample_pages_per_gigabytes;
 
     switch (config.mode) {
@@ -568,7 +573,6 @@ static inline void dirtyrate_manual_reset_protect(void)
 
 static void calculate_dirtyrate_dirty_bitmap(struct DirtyRateConfig config)
 {
-    int64_t msec = 0;
     int64_t start_time;
     DirtyPageRecord dirty_pages;
 
@@ -596,9 +600,7 @@ static void calculate_dirtyrate_dirty_bitmap(struct DirtyRateConfig config)
     start_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
     DirtyStat.start_time = start_time / 1000;
 
-    msec = config.sample_period_seconds * 1000;
-    msec = dirty_stat_wait(msec, start_time);
-    DirtyStat.calc_time = msec / 1000;
+    DirtyStat.calc_time_ms = dirty_stat_wait(config.calc_time_ms, start_time);
 
     /*
      * do two things.
@@ -609,12 +611,12 @@ static void calculate_dirtyrate_dirty_bitmap(struct DirtyRateConfig config)
 
     record_dirtypages_bitmap(&dirty_pages, false);
 
-    DirtyStat.dirty_rate = do_calculate_dirtyrate(dirty_pages, msec);
+    DirtyStat.dirty_rate = do_calculate_dirtyrate(dirty_pages,
+                                                  DirtyStat.calc_time_ms);
 }
 
 static void calculate_dirtyrate_dirty_ring(struct DirtyRateConfig config)
 {
-    int64_t duration;
     uint64_t dirtyrate = 0;
     uint64_t dirtyrate_sum = 0;
     int i = 0;
@@ -625,12 +627,10 @@ static void calculate_dirtyrate_dirty_ring(struct DirtyRateConfig config)
     DirtyStat.start_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME) / 1000;
 
     /* calculate vcpu dirtyrate */
-    duration = vcpu_calculate_dirtyrate(config.sample_period_seconds * 1000,
-                                        &DirtyStat.dirty_ring,
-                                        GLOBAL_DIRTY_DIRTY_RATE,
-                                        true);
-
-    DirtyStat.calc_time = duration / 1000;
+    DirtyStat.calc_time_ms = vcpu_calculate_dirtyrate(config.calc_time_ms,
+                                                      &DirtyStat.dirty_ring,
+                                                      GLOBAL_DIRTY_DIRTY_RATE,
+                                                      true);
 
     /* calculate vm dirtyrate */
     for (i = 0; i < DirtyStat.dirty_ring.nvcpu; i++) {
@@ -646,7 +646,6 @@ static void calculate_dirtyrate_sample_vm(struct DirtyRateConfig config)
 {
     struct RamblockDirtyInfo *block_dinfo = NULL;
     int block_count = 0;
-    int64_t msec = 0;
     int64_t initial_time;
 
     rcu_read_lock();
@@ -656,17 +655,16 @@ static void calculate_dirtyrate_sample_vm(struct DirtyRateConfig config)
     }
     rcu_read_unlock();
 
-    msec = config.sample_period_seconds * 1000;
-    msec = dirty_stat_wait(msec, initial_time);
+    DirtyStat.calc_time_ms = dirty_stat_wait(config.calc_time_ms,
+                                             initial_time);
     DirtyStat.start_time = initial_time / 1000;
-    DirtyStat.calc_time = msec / 1000;
 
     rcu_read_lock();
     if (!compare_page_hash_info(block_dinfo, block_count)) {
         goto out;
     }
 
-    update_dirtyrate(msec);
+    update_dirtyrate(DirtyStat.calc_time_ms);
 
 out:
     rcu_read_unlock();
@@ -711,7 +709,10 @@ void *get_dirtyrate_thread(void *arg)
     return NULL;
 }
 
-void qmp_calc_dirty_rate(int64_t calc_time,
+void qmp_calc_dirty_rate(bool has_calc_time,
+                         int64_t calc_time,
+                         bool has_calc_time_ms,
+                         int64_t calc_time_ms,
                          bool has_sample_pages,
                          int64_t sample_pages,
                          bool has_mode,
@@ -731,10 +732,21 @@ void qmp_calc_dirty_rate(int64_t calc_time,
         return;
     }
 
-    if (!is_sample_period_valid(calc_time)) {
-        error_setg(errp, "calc-time is out of range[%d, %d].",
-                         MIN_FETCH_DIRTYRATE_TIME_SEC,
-                         MAX_FETCH_DIRTYRATE_TIME_SEC);
+    if ((int)has_calc_time + (int)has_calc_time_ms != 1) {
+        error_setg(errp, "Exactly one of calc-time and calc-time-ms must"
+                         " be specified");
+        return;
+    }
+    if (has_calc_time) {
+        /*
+         * The worst thing that can happen due to overflow is that
+         * invalid value will become valid.
+         */
+        calc_time_ms = calc_time * 1000;
+    }
+    if (!is_calc_time_valid(calc_time_ms)) {
+        error_setg(errp, "Calculation time is out of range[%dms, %dms].",
+                         MIN_CALC_TIME_MS, MAX_CALC_TIME_MS);
         return;
     }
 
@@ -781,7 +793,7 @@ void qmp_calc_dirty_rate(int64_t calc_time,
         return;
     }
 
-    config.sample_period_seconds = calc_time;
+    config.calc_time_ms = calc_time_ms;
     config.sample_pages_per_gigabytes = sample_pages;
     config.mode = mode;
 
@@ -867,8 +879,11 @@ void hmp_calc_dirty_rate(Monitor *mon, const QDict *qdict)
         mode = DIRTY_RATE_MEASURE_MODE_DIRTY_RING;
     }
 
-    qmp_calc_dirty_rate(sec, has_sample_pages, sample_pages, true,
-                        mode, &err);
+    qmp_calc_dirty_rate(true, sec, /* calc_time */
+                        false, 0, /* calc_time_ms */
+                        has_sample_pages, sample_pages,
+                        true, mode,
+                        &err);
     if (err) {
         hmp_handle_error(mon, err);
         return;
-- 
2.30.2
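
The prediction procedure described in the commit message can be sketched
as a small helper. This is only an illustration and not part of the
patch: the function names estimate_transfer_ms() and
migration_should_converge() are hypothetical, and the network throughput
is an input the caller has to supply. Only the arithmetic (measured rate
times period, divided by throughput) comes from the description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a QEMU function): estimate how long it takes
 * to transfer the memory dirtied during one downtime-sized period.
 *
 * dirty_rate_mib_s  - dirty rate in MiB/s, measured with calc-time-ms
 *                     set to the maximum allowed downtime
 * downtime_ms       - maximum allowed downtime in milliseconds
 * throughput_mib_s  - assumed network throughput in MiB/s
 */
static int64_t estimate_transfer_ms(int64_t dirty_rate_mib_s,
                                    int64_t downtime_ms,
                                    int64_t throughput_mib_s)
{
    assert(dirty_rate_mib_s >= 0);
    assert(downtime_ms > 0);
    assert(throughput_mib_s > 0);

    /*
     * dirtied volume (MiB) = rate * downtime_ms / 1000
     * transfer time (ms)   = volume / throughput * 1000
     * The two factors of 1000 cancel. Dividing last keeps the
     * truncation of integer division to a single step.
     */
    return dirty_rate_mib_s * downtime_ms / throughput_mib_s;
}

/*
 * Hypothetical helper: apply the convergence criterion from the commit
 * message, i.e. the estimated transfer time must stay below the
 * maximum allowed downtime.
 */
static bool migration_should_converge(int64_t dirty_rate_mib_s,
                                      int64_t downtime_ms,
                                      int64_t throughput_mib_s)
{
    return estimate_transfer_ms(dirty_rate_mib_s, downtime_ms,
                                throughput_mib_s) < downtime_ms;
}
```

As a usage example, take the 300 ms row of the measurement table
(1120 MiB/s) and an assumed throughput of 1200 MiB/s:
estimate_transfer_ms(1120, 300, 1200) gives 280, which fits into the
300 ms budget. For the 100 ms row (1880 MiB/s) the same throughput gives
156, which exceeds the 100 ms budget, so convergence would not be
expected under that assumption.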