From: wucy11@chinatelecom.cn
To: qemu-devel@nongnu.org
Cc: Paolo Bonzini, Peter Xu, David Hildenbrand, Juan Quintela,
    Richard Henderson, "Dr. David Alan Gilbert",
    Philippe Mathieu-Daudé, huangy81@chinatelecom.cn,
    tugy@chinatelecom.cn, baiyw2@chinatelecom.cn,
    yuanmh12@chinatelecom.cn, yubin1@chinatelecom.cn,
    dengpc12@chinatelecom.cn
Subject: [PATCH v1 1/5] kvm, memory: Optimize dirty page collection for dirty ring
Date: Wed, 23 Mar 2022 11:18:34 +0800
Message-Id: <1c2c18ab43ec43959c3464d216e6a3144b802a53.1648002360.git.wucy11@chinatelecom.cn>
David Alan Gilbert" , Peter Xu , =?UTF-8?q?Philippe=20Mathieu-Daud=C3=A9?= , yubin1@chinatelecom.cn, dengpc12@chinatelecom.cn, Paolo Bonzini , wucy11@chinatelecom.cn Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZM-MESSAGEID: 1648005665742100001 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" From: Chongyun Wu When log_sync_global of dirty ring is called, it will collect dirty pages on all cpus, including all dirty pages on memslot, so when memory_region_sync_dirty_bitmap collects dirty pages from KVM, this interface needs to be called once, instead of traversing every dirty page. Each memslot is called once, which is meaningless and time-consuming. So only need to call log_sync_global once in memory_region_sync_dirty_bitmap. Signed-off-by: Chongyun Wu --- softmmu/memory.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/softmmu/memory.c b/softmmu/memory.c index 8060c6d..46c3ff9 100644 --- a/softmmu/memory.c +++ b/softmmu/memory.c @@ -2184,6 +2184,12 @@ static void memory_region_sync_dirty_bitmap(MemoryRe= gion *mr) */ listener->log_sync_global(listener); trace_memory_region_sync_dirty(mr ? mr->name : "(all)", listen= er->name, 1); + /* + * The log_sync_global of the dirty ring will collect the dirty + * pages of all memslots at one time, so there is no need to + * call log_sync_global once when traversing each memslot. + */ + break; } } } --=20 1.8.3.1 From nobody Sun Feb 8 14:40:39 2026 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1648005945317457.8218858296241; Tue, 22 Mar 2022 20:25:45 -0700 (PDT) Received: from localhost ([::1]:35232 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nWrd2-0006HB-Oz for importer@patchew.org; Tue, 22 Mar 2022 23:25:45 -0400 Received: from eggs.gnu.org ([209.51.188.92]:50502) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nWrWa-0002bM-C8 for qemu-devel@nongnu.org; Tue, 22 Mar 2022 23:19:04 -0400 Received: from prt-mail.chinatelecom.cn ([42.123.76.222]:48550 helo=chinatelecom.cn) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nWrWY-0006g6-4F for qemu-devel@nongnu.org; Tue, 22 Mar 2022 23:19:04 -0400 Received: from clientip-36.111.64.85 (unknown [172.18.0.218]) by chinatelecom.cn (HERMES) with SMTP id CEC1D28046D; Wed, 23 Mar 2022 11:18:55 +0800 (CST) Received: from ([172.18.0.218]) by app0025 with ESMTP id f841c27cea0042d8aac437c8432f9b1e for qemu-devel@nongnu.org; Wed, 23 Mar 2022 11:19:00 CST HMM_SOURCE_IP: 172.18.0.218:52666.2058481772 HMM_ATTACHE_NUM: 0000 HMM_SOURCE_TYPE: SMTP X-189-SAVE-TO-SEND: +wucy11@chinatelecom.cn X-Transaction-ID: f841c27cea0042d8aac437c8432f9b1e X-Real-From: wucy11@chinatelecom.cn X-Receive-IP: 172.18.0.218 X-MEDUSA-Status: 0 From: wucy11@chinatelecom.cn To: qemu-devel@nongnu.org Subject: [PATCH v1 2/5] kvm: Dynamically adjust the rate of dirty ring reaper thread Date: Wed, 23 Mar 2022 11:18:35 +0800 Message-Id: <34f1a7ab114dc52aaae4551b6ef12d88437df7df.1648002360.git.wucy11@chinatelecom.cn> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: References: In-Reply-To: References: Received-SPF: pass (zohomail.com: domain of gnu.org 

From: Chongyun Wu <wucy11@chinatelecom.cn>

Dynamically adjust the pace of the dirty ring collection (reaper)
thread to reduce ring-full events, thereby reducing the impact on the
guest, improving the efficiency of dirty page collection, and thus
improving migration efficiency.

Implementation:
1) Define several collection speeds for the reaper thread.
2) Divide the total number of dirty pages collected in each pass by
   the ring size to get a ratio that indicates how full the ring is
   with dirty pages. The higher the ratio, the more likely the ring
   is to fill up.
3) Map different ratios to different running speeds. A higher ratio
   means a higher running speed is required to collect dirty pages as
   soon as possible, ensuring that ring-full events stay rare and do
   not disturb the guest's workload. (A worked example of this mapping
   follows below.)

This patch significantly reduces the number of ring-full events under
high memory dirty page pressure, minimizing the impact on guests. In
the qemu guestperf test with this patch, memory performance during
migration is somewhat better than with the bitmap method and
significantly better than with the unoptimized dirty ring method. For
detailed test data, please refer to the follow-up patches of this
series.
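
As a worked example (not part of the original patch), here is a
minimal stand-alone sketch of the band selection that the
calcu_sleep_time() helper added below performs. The ring size, the 0.1
ratio step, and the sleep times mirror the patch's constants; the
dirty_count input is a hypothetical value:

    #include <stdio.h>

    int main(void)
    {
        const unsigned ring_size = 4096;   /* kvm_dirty_ring_size */
        const unsigned dirty_count = 1200; /* GFNs reaped in one pass (hypothetical) */
        const float step = 0.1f;           /* ratio_adjust_threshold */
        /* Sleep times for NORMAL, FAST1..FAST4, MAX_SPEED */
        static const unsigned sleep_us[] =
            { 1000000, 500000, 250000, 125000, 100000, 80000 };

        float ratio = (float)dirty_count / ring_size;  /* ~0.29 */
        unsigned level = (unsigned)(ratio / step);     /* desired band */
        if (level > 5) {
            level = 5;                                 /* clamp to MAX_SPEED */
        }
        /* Prints: ratio=0.29 -> level 2, sleep 250000 us (FAST2) */
        printf("ratio=%.2f -> level %u, sleep %u us\n",
               ratio, level, sleep_us[level]);
        return 0;
    }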

Signed-off-by: Chongyun Wu <wucy11@chinatelecom.cn>
---
 accel/kvm/kvm-all.c | 149 ++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 144 insertions(+), 5 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 27864df..65a4de8 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -91,6 +91,27 @@ enum KVMDirtyRingReaperState {
     KVM_DIRTY_RING_REAPER_REAPING,
 };
 
+enum KVMDirtyRingReaperRunLevel {
+    /* The reaper runs at the default normal speed */
+    KVM_DIRTY_RING_REAPER_RUN_NORMAL = 0,
+    /* The reaper accelerates through different gears */
+    KVM_DIRTY_RING_REAPER_RUN_FAST1,
+    KVM_DIRTY_RING_REAPER_RUN_FAST2,
+    KVM_DIRTY_RING_REAPER_RUN_FAST3,
+    KVM_DIRTY_RING_REAPER_RUN_FAST4,
+    /* The reaper runs at the fastest speed */
+    KVM_DIRTY_RING_REAPER_RUN_MAX_SPEED,
+};
+
+enum KVMDirtyRingReaperSpeedControl {
+    /* Maintain the current speed */
+    KVM_DIRTY_RING_REAPER_SPEED_CONTROL_KEEP = 0,
+    /* Increase the current speed */
+    KVM_DIRTY_RING_REAPER_SPEED_CONTROL_UP,
+    /* Decrease the current speed */
+    KVM_DIRTY_RING_REAPER_SPEED_CONTROL_DOWN
+};
+
 /*
  * KVM reaper instance, responsible for collecting the KVM dirty bits
  * via the dirty ring.
@@ -100,6 +121,11 @@ struct KVMDirtyRingReaper {
     QemuThread reaper_thr;
     volatile uint64_t reaper_iteration; /* iteration number of reaper thr */
     volatile enum KVMDirtyRingReaperState reaper_state; /* reap thr state */
+    /* Control the reaper thread's running speed to fit the dirty page rate */
+    enum KVMDirtyRingReaperRunLevel run_level;
+    uint64_t ring_full_cnt;
+    float ratio_adjust_threshold;
+    int stable_count_threshold;
 };
 
 struct KVMState
@@ -1449,11 +1475,115 @@ out:
     kvm_slots_unlock();
 }
 
+static uint64_t calcu_sleep_time(KVMState *s,
+                                 uint64_t dirty_count,
+                                 uint64_t ring_full_cnt_last,
+                                 uint32_t *speed_down_cnt)
+{
+    float ratio = 0.0;
+    uint64_t sleep_time = 1000000;
+    enum KVMDirtyRingReaperRunLevel run_level_want;
+    enum KVMDirtyRingReaperSpeedControl speed_control;
+
+    /*
+     * When the number of dirty pages collected exceeds the given
+     * percentage of the ring size, the speed-up action is triggered.
+     */
+    s->reaper.ratio_adjust_threshold = 0.1;
+    s->reaper.stable_count_threshold = 5;
+
+    ratio = (float)dirty_count / s->kvm_dirty_ring_size;
+
+    if (s->reaper.ring_full_cnt > ring_full_cnt_last) {
+        /* A new ring-full event occurred: speed up the reaper thread */
+        if (s->reaper.run_level != KVM_DIRTY_RING_REAPER_RUN_MAX_SPEED) {
+            s->reaper.run_level++;
+        }
+    } else {
+        /*
+         * If more dirty pages are collected in this loop and this state
+         * persists for several iterations, try to speed up the reaper
+         * thread. If the state is stable, decide which speed to use.
+         */
+        if (ratio < s->reaper.ratio_adjust_threshold) {
+            run_level_want = KVM_DIRTY_RING_REAPER_RUN_NORMAL;
+        } else if (ratio < s->reaper.ratio_adjust_threshold * 2) {
+            run_level_want = KVM_DIRTY_RING_REAPER_RUN_FAST1;
+        } else if (ratio < s->reaper.ratio_adjust_threshold * 3) {
+            run_level_want = KVM_DIRTY_RING_REAPER_RUN_FAST2;
+        } else if (ratio < s->reaper.ratio_adjust_threshold * 4) {
+            run_level_want = KVM_DIRTY_RING_REAPER_RUN_FAST3;
+        } else if (ratio < s->reaper.ratio_adjust_threshold * 5) {
+            run_level_want = KVM_DIRTY_RING_REAPER_RUN_FAST4;
+        } else {
+            run_level_want = KVM_DIRTY_RING_REAPER_RUN_MAX_SPEED;
+        }
+
+        /* Determine whether to speed up or slow down */
+        if (run_level_want > s->reaper.run_level) {
+            speed_control = KVM_DIRTY_RING_REAPER_SPEED_CONTROL_UP;
+            *speed_down_cnt = 0;
+        } else if (run_level_want < s->reaper.run_level) {
+            speed_control = KVM_DIRTY_RING_REAPER_SPEED_CONTROL_DOWN;
+            (*speed_down_cnt)++;
+        } else {
+            speed_control = KVM_DIRTY_RING_REAPER_SPEED_CONTROL_KEEP;
+        }
+
+        /* Keep the reaper thread running at a suitable speed level */
+        if (speed_control == KVM_DIRTY_RING_REAPER_SPEED_CONTROL_UP) {
+            /* Speed up immediately without checking for stability */
+            s->reaper.run_level++;
+        } else if (speed_control ==
+                   KVM_DIRTY_RING_REAPER_SPEED_CONTROL_DOWN) {
+            /* Before slowing down, filter out transient states */
+            if (*speed_down_cnt > s->reaper.stable_count_threshold) {
+                s->reaper.run_level--;
+            }
+        }
+    }
+
+    /* Set the actual running rate of the reaper */
+    switch (s->reaper.run_level) {
+    case KVM_DIRTY_RING_REAPER_RUN_NORMAL:
+        sleep_time = 1000000;
+        break;
+    case KVM_DIRTY_RING_REAPER_RUN_FAST1:
+        sleep_time = 500000;
+        break;
+    case KVM_DIRTY_RING_REAPER_RUN_FAST2:
+        sleep_time = 250000;
+        break;
+    case KVM_DIRTY_RING_REAPER_RUN_FAST3:
+        sleep_time = 125000;
+        break;
+    case KVM_DIRTY_RING_REAPER_RUN_FAST4:
+        sleep_time = 100000;
+        break;
+    case KVM_DIRTY_RING_REAPER_RUN_MAX_SPEED:
+        sleep_time = 80000;
+        break;
+    default:
+        sleep_time = 1000000;
+        error_report("Bad reaper thread run level, using default");
+    }
+
+    return sleep_time;
+}
+
 static void *kvm_dirty_ring_reaper_thread(void *data)
 {
     KVMState *s = data;
     struct KVMDirtyRingReaper *r = &s->reaper;
 
+    uint64_t count = 0;
+    uint64_t sleep_time = 1000000;
+    uint64_t ring_full_cnt_last = 0;
+    /* Filter out speed jitter */
+    uint32_t speed_down_cnt = 0;
+
     rcu_register_thread();
 
     trace_kvm_dirty_ring_reaper("init");
@@ -1461,18 +1591,26 @@ static void *kvm_dirty_ring_reaper_thread(void *data)
     while (true) {
         r->reaper_state = KVM_DIRTY_RING_REAPER_WAIT;
         trace_kvm_dirty_ring_reaper("wait");
-        /*
-         * TODO: provide a smarter timeout rather than a constant?
-         */
-        sleep(1);
+
+        ring_full_cnt_last = s->reaper.ring_full_cnt;
+
+        usleep(sleep_time);
 
         trace_kvm_dirty_ring_reaper("wakeup");
         r->reaper_state = KVM_DIRTY_RING_REAPER_REAPING;
 
         qemu_mutex_lock_iothread();
-        kvm_dirty_ring_reap(s);
+        count = kvm_dirty_ring_reap(s);
         qemu_mutex_unlock_iothread();
 
+        /*
+         * Calculate the appropriate sleep time according to the
+         * current dirty page rate.
+         */
+        sleep_time = calcu_sleep_time(s, count,
+                                      ring_full_cnt_last,
+                                      &speed_down_cnt);
+
         r->reaper_iteration++;
     }
 
@@ -2958,6 +3096,7 @@ int kvm_cpu_exec(CPUState *cpu)
             trace_kvm_dirty_ring_full(cpu->cpu_index);
             qemu_mutex_lock_iothread();
             kvm_dirty_ring_reap(kvm_state);
+            kvm_state->reaper.ring_full_cnt++;
             qemu_mutex_unlock_iothread();
             ret = 0;
             break;
--
1.8.3.1

From: wucy11@chinatelecom.cn
To: qemu-devel@nongnu.org
Subject: [PATCH v1 3/5] kvm: Dirty ring autoconverge optimization for kvm_cpu_synchronize_kick_all
Date: Wed, 23 Mar 2022 11:18:36 +0800
David Alan Gilbert" , Peter Xu , =?UTF-8?q?Philippe=20Mathieu-Daud=C3=A9?= , yubin1@chinatelecom.cn, dengpc12@chinatelecom.cn, Paolo Bonzini , wucy11@chinatelecom.cn Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZM-MESSAGEID: 1648005988667100001 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" From: Chongyun Wu Dirty ring feature need call kvm_cpu_synchronize_kick_all to flush hardware buffers into KVMslots, but when aucoverge run kvm_cpu_synchronize_kick_all calling will become more and more time consuming. This will significantly reduce the efficiency of dirty page queries, especially when memory pressure is high and the speed limit is high. When the CPU speed limit is high and kvm_cpu_synchronize_kick_all is time-consuming, the rate of dirty pages generated by the VM will also be significantly reduced, so it is not necessary to call kvm_cpu_synchronize_kick_all at this time, just call it once before stopping the VM. This will significantly improve the efficiency of dirty page queries under high pressure. Signed-off-by: Chongyun Wu --- accel/kvm/kvm-all.c | 25 +++++-------------------- include/exec/cpu-common.h | 2 ++ migration/migration.c | 11 +++++++++++ softmmu/cpus.c | 18 ++++++++++++++++++ 4 files changed, 36 insertions(+), 20 deletions(-) diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c index 65a4de8..5e02700 100644 --- a/accel/kvm/kvm-all.c +++ b/accel/kvm/kvm-all.c @@ -48,6 +48,8 @@ =20 #include "hw/boards.h" =20 +#include "sysemu/cpu-throttle.h" + /* This check must be after config-host.h is included */ #ifdef CONFIG_EVENTFD #include @@ -839,25 +841,6 @@ static uint64_t kvm_dirty_ring_reap(KVMState *s) return total; } =20 -static void do_kvm_cpu_synchronize_kick(CPUState *cpu, run_on_cpu_data arg) -{ - /* No need to do anything */ -} - -/* - * Kick all vcpus out in a synchronized way. When returned, we - * guarantee that every vcpu has been kicked and at least returned to - * userspace once. - */ -static void kvm_cpu_synchronize_kick_all(void) -{ - CPUState *cpu; - - CPU_FOREACH(cpu) { - run_on_cpu(cpu, do_kvm_cpu_synchronize_kick, RUN_ON_CPU_NULL); - } -} - /* * Flush all the existing dirty pages to the KVM slot buffers. When * this call returns, we guarantee that all the touched dirty pages @@ -879,7 +862,9 @@ static void kvm_dirty_ring_flush(void) * First make sure to flush the hardware buffers by kicking all * vcpus out in a synchronous way. 
*/ - kvm_cpu_synchronize_kick_all(); + if (!cpu_throttle_get_percentage()) { + qemu_kvm_cpu_synchronize_kick_all(); + } kvm_dirty_ring_reap(kvm_state); trace_kvm_dirty_ring_flush(1); } diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h index 50a7d29..13045b3 100644 --- a/include/exec/cpu-common.h +++ b/include/exec/cpu-common.h @@ -160,4 +160,6 @@ extern int singlestep; =20 void list_cpus(const char *optarg); =20 +void qemu_kvm_cpu_synchronize_kick_all(void); + #endif /* CPU_COMMON_H */ diff --git a/migration/migration.c b/migration/migration.c index 695f0f2..ca1db88 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -61,6 +61,7 @@ #include "sysemu/cpus.h" #include "yank_functions.h" #include "sysemu/qtest.h" +#include "sysemu/kvm.h" =20 #define MAX_THROTTLE (128 << 20) /* Migration transfer speed throttl= ing */ =20 @@ -3183,6 +3184,16 @@ static void migration_completion(MigrationState *s) =20 if (!ret) { bool inactivate =3D !migrate_colo_enabled(); + /* + * Before stop vm do qemu_kvm_cpu_synchronize_kick_all to + * fulsh hardware buffer into KVMslots for dirty ring + * optmiaztion, If qemu_kvm_cpu_synchronize_kick_all is not + * called when the CPU speed is limited to improve efficiency + */ + if (kvm_dirty_ring_enabled() + && cpu_throttle_get_percentage()) { + qemu_kvm_cpu_synchronize_kick_all(); + } ret =3D vm_stop_force_state(RUN_STATE_FINISH_MIGRATE); trace_migration_completion_vm_stop(ret); if (ret >=3D 0) { diff --git a/softmmu/cpus.c b/softmmu/cpus.c index 7b75bb6..d028d83 100644 --- a/softmmu/cpus.c +++ b/softmmu/cpus.c @@ -810,3 +810,21 @@ void qmp_inject_nmi(Error **errp) nmi_monitor_handle(monitor_get_cpu_index(monitor_cur()), errp); } =20 +static void do_kvm_cpu_synchronize_kick(CPUState *cpu, run_on_cpu_data arg) +{ + /* No need to do anything */ +} + +/* + * Kick all vcpus out in a synchronized way. When returned, we + * guarantee that every vcpu has been kicked and at least returned to + * userspace once. 
+ */ +void qemu_kvm_cpu_synchronize_kick_all(void) +{ + CPUState *cpu; + + CPU_FOREACH(cpu) { + run_on_cpu(cpu, do_kvm_cpu_synchronize_kick, RUN_ON_CPU_NULL); + } +} --=20 1.8.3.1 From nobody Sun Feb 8 14:40:39 2026 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1648006068000567.6720700872823; Tue, 22 Mar 2022 20:27:48 -0700 (PDT) Received: from localhost ([::1]:39558 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nWrf1-0000t6-0a for importer@patchew.org; Tue, 22 Mar 2022 23:27:47 -0400 Received: from eggs.gnu.org ([209.51.188.92]:50530) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nWrWk-0002xC-Sp for qemu-devel@nongnu.org; Tue, 22 Mar 2022 23:19:14 -0400 Received: from prt-mail.chinatelecom.cn ([42.123.76.222]:48659 helo=chinatelecom.cn) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nWrWi-0006hP-ML for qemu-devel@nongnu.org; Tue, 22 Mar 2022 23:19:14 -0400 Received: from clientip-36.111.64.85 (unknown [172.18.0.218]) by chinatelecom.cn (HERMES) with SMTP id 6F516280485; Wed, 23 Mar 2022 11:19:06 +0800 (CST) Received: from ([172.18.0.218]) by app0025 with ESMTP id 95e67f6018bb47a69a4e4cd53159b7aa for qemu-devel@nongnu.org; Wed, 23 Mar 2022 11:19:11 CST HMM_SOURCE_IP: 172.18.0.218:52666.2058481772 HMM_ATTACHE_NUM: 0000 HMM_SOURCE_TYPE: SMTP X-189-SAVE-TO-SEND: +wucy11@chinatelecom.cn X-Transaction-ID: 95e67f6018bb47a69a4e4cd53159b7aa X-Real-From: wucy11@chinatelecom.cn X-Receive-IP: 172.18.0.218 X-MEDUSA-Status: 0 From: wucy11@chinatelecom.cn To: qemu-devel@nongnu.org Subject: [PATCH v1 4/5] kvm: Introduce a dirty rate calculation method based on dirty ring Date: Wed, 23 Mar 2022 11:18:37 +0800 Message-Id: <7593aa50f8595cf06df7675032ccbe877c1823f4.1648002360.git.wucy11@chinatelecom.cn> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: References: In-Reply-To: References: Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=42.123.76.222; envelope-from=wucy11@chinatelecom.cn; helo=chinatelecom.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: baiyw2@chinatelecom.cn, yuanmh12@chinatelecom.cn, tugy@chinatelecom.cn, David Hildenbrand , huangy81@chinatelecom.cn, Juan Quintela , Richard Henderson , "Dr. 
David Alan Gilbert" , Peter Xu , =?UTF-8?q?Philippe=20Mathieu-Daud=C3=A9?= , yubin1@chinatelecom.cn, dengpc12@chinatelecom.cn, Paolo Bonzini , wucy11@chinatelecom.cn Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZM-MESSAGEID: 1648006069150100001 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" From: Chongyun Wu A new structure KVMDirtyRingDirtyCounter is introduced in KVMDirtyRingReaper to record the number of dirty pages within a period of time. When kvm_dirty_ring_mark_page collects dirty pages, if it finds that the current dirty pages are not duplicates, it increases the dirty_pages_period count. Divide the dirty_pages_period count by the interval to get the dirty page rate for this period. And use dirty_pages_period_peak_rate to count the highest dirty page rate, to solve the problem that the dirty page collection rate may change greatly during a period of time, resulting in a large change in the dirty page rate. Through sufficient testing, it is found that the dirty rate calculated after kvm_dirty_ring_flush usually matches the actual pressure, and the dirty rate counted per second may change in the subsequent seconds, so record the peak dirty rate as the real dirty pages rate. This dirty pages rate is mainly used as the subsequent autoconverge calculation speed limit throttle. Signed-off-by: Chongyun Wu --- accel/kvm/kvm-all.c | 64 ++++++++++++++++++++++++++++++++++++++++++++++++= +++- include/sysemu/kvm.h | 2 ++ 2 files changed, 65 insertions(+), 1 deletion(-) diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c index 5e02700..a158b1c 100644 --- a/accel/kvm/kvm-all.c +++ b/accel/kvm/kvm-all.c @@ -114,6 +114,13 @@ enum KVMDirtyRingReaperSpeedControl { KVM_DIRTY_RING_REAPER_SPEED_CONTROL_DOWN }; =20 +struct KVMDirtyRingDirtyCounter { + int64_t time_last_count; + uint64_t dirty_pages_period; + int64_t dirty_pages_rate; + int64_t dirty_pages_period_peak_rate; +}; + /* * KVM reaper instance, responsible for collecting the KVM dirty bits * via the dirty ring. @@ -128,6 +135,7 @@ struct KVMDirtyRingReaper { uint64_t ring_full_cnt; float ratio_adjust_threshold; int stable_count_threshold; + struct KVMDirtyRingDirtyCounter counter; /* Calculate dirty pages rate= */ }; =20 struct KVMState @@ -739,7 +747,9 @@ static void kvm_dirty_ring_mark_page(KVMState *s, uint3= 2_t as_id, return; } =20 - set_bit(offset, mem->dirty_bmap); + if (!test_and_set_bit(offset, mem->dirty_bmap)) { + s->reaper.counter.dirty_pages_period++; + } } =20 static bool dirty_gfn_is_dirtied(struct kvm_dirty_gfn *gfn) @@ -783,6 +793,56 @@ static uint32_t kvm_dirty_ring_reap_one(KVMState *s, C= PUState *cpu) return count; } =20 +int64_t kvm_dirty_ring_get_rate(void) +{ + return kvm_state->reaper.counter.dirty_pages_rate; +} + +int64_t kvm_dirty_ring_get_peak_rate(void) +{ + return kvm_state->reaper.counter.dirty_pages_period_peak_rate; +} + +static void kvm_dirty_ring_reap_count(KVMState *s) +{ + int64_t spend_time =3D 0; + int64_t end_time; + + if (!s->reaper.counter.time_last_count) { + s->reaper.counter.time_last_count =3D + qemu_clock_get_ms(QEMU_CLOCK_REALTIME); + } + + end_time =3D qemu_clock_get_ms(QEMU_CLOCK_REALTIME); + spend_time =3D end_time - s->reaper.counter.time_last_count; + + if (!s->reaper.counter.dirty_pages_period || + !spend_time) { + return; + } + + /* + * More than 1 second =3D 1000 millisecons, + * or trigger by kvm_log_sync_global which spend time + * more than 300 milliscons. 

Signed-off-by: Chongyun Wu <wucy11@chinatelecom.cn>
---
 accel/kvm/kvm-all.c  | 64 +++++++++++++++++++++++++++++++++++++++++++++++++++-
 include/sysemu/kvm.h |  2 ++
 2 files changed, 65 insertions(+), 1 deletion(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 5e02700..a158b1c 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -114,6 +114,13 @@ enum KVMDirtyRingReaperSpeedControl {
     KVM_DIRTY_RING_REAPER_SPEED_CONTROL_DOWN
 };
 
+struct KVMDirtyRingDirtyCounter {
+    int64_t time_last_count;
+    uint64_t dirty_pages_period;
+    int64_t dirty_pages_rate;
+    int64_t dirty_pages_period_peak_rate;
+};
+
 /*
  * KVM reaper instance, responsible for collecting the KVM dirty bits
  * via the dirty ring.
@@ -128,6 +135,7 @@ struct KVMDirtyRingReaper {
     uint64_t ring_full_cnt;
     float ratio_adjust_threshold;
     int stable_count_threshold;
+    struct KVMDirtyRingDirtyCounter counter; /* Calculate dirty pages rate */
 };
 
 struct KVMState
@@ -739,7 +747,9 @@ static void kvm_dirty_ring_mark_page(KVMState *s, uint32_t as_id,
         return;
     }
 
-    set_bit(offset, mem->dirty_bmap);
+    if (!test_and_set_bit(offset, mem->dirty_bmap)) {
+        s->reaper.counter.dirty_pages_period++;
+    }
 }
 
 static bool dirty_gfn_is_dirtied(struct kvm_dirty_gfn *gfn)
@@ -783,6 +793,56 @@ static uint32_t kvm_dirty_ring_reap_one(KVMState *s, CPUState *cpu)
     return count;
 }
 
+int64_t kvm_dirty_ring_get_rate(void)
+{
+    return kvm_state->reaper.counter.dirty_pages_rate;
+}
+
+int64_t kvm_dirty_ring_get_peak_rate(void)
+{
+    return kvm_state->reaper.counter.dirty_pages_period_peak_rate;
+}
+
+static void kvm_dirty_ring_reap_count(KVMState *s)
+{
+    int64_t spend_time = 0;
+    int64_t end_time;
+
+    if (!s->reaper.counter.time_last_count) {
+        s->reaper.counter.time_last_count =
+            qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+    }
+
+    end_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+    spend_time = end_time - s->reaper.counter.time_last_count;
+
+    if (!s->reaper.counter.dirty_pages_period ||
+        !spend_time) {
+        return;
+    }
+
+    /*
+     * Count once more than 1 second (1000 milliseconds) has passed,
+     * or when triggered by kvm_log_sync_global, which can take more
+     * than 300 milliseconds.
+     */
+    if (spend_time > 1000) {
+        /* Count the dirty page rate during this period */
+        s->reaper.counter.dirty_pages_rate =
+            s->reaper.counter.dirty_pages_period * 1000 / spend_time;
+        /* Update the peak dirty page rate for this period */
+        if (s->reaper.counter.dirty_pages_rate >
+            s->reaper.counter.dirty_pages_period_peak_rate) {
+            s->reaper.counter.dirty_pages_period_peak_rate =
+                s->reaper.counter.dirty_pages_rate;
+        }
+
+        /* Reset counters */
+        s->reaper.counter.dirty_pages_period = 0;
+        s->reaper.counter.time_last_count = 0;
+    }
+}
+
 /* Must be with slots_lock held */
 static uint64_t kvm_dirty_ring_reap_locked(KVMState *s)
 {
@@ -793,6 +853,8 @@ static uint64_t kvm_dirty_ring_reap_locked(KVMState *s)
 
     stamp = get_clock();
 
+    kvm_dirty_ring_reap_count(s);
+
     CPU_FOREACH(cpu) {
         total += kvm_dirty_ring_reap_one(s, cpu);
     }
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index a783c78..05846f9 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -582,4 +582,6 @@ bool kvm_cpu_check_are_resettable(void);
 bool kvm_arch_cpu_check_are_resettable(void);
 
 bool kvm_dirty_ring_enabled(void);
+int64_t kvm_dirty_ring_get_rate(void);
+int64_t kvm_dirty_ring_get_peak_rate(void);
 #endif
--
1.8.3.1

From: wucy11@chinatelecom.cn
To: qemu-devel@nongnu.org
Subject: [PATCH v1 5/5] migration: Calculate the appropriate throttle for autoconverge
Date: Wed, 23 Mar 2022 11:18:38 +0800

From: Chongyun Wu <wucy11@chinatelecom.cn>

The current autoconverge algorithm does not compute the CPU throttle
level it actually needs; it converges on one by repeated attempts. It
starts throttling from an initial value and, whenever migration fails
to complete for two consecutive rounds, raises the throttle level and
tries again, until the level reaches 99%. If memory pressure is high
or migration bandwidth is low, the level gradually climbs from the
initial 20% to 99%, which is a long and time-consuming process.

This optimization instead calculates a matching throttle level from
the current migration bandwidth and the current dirty page rate. When
memory pressure is high, it avoids the unnecessary, time-consuming
cycle of repeatedly raising the limit and significantly improves
migration efficiency. (A worked example of the calculation follows
below.)
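
As a worked example (not part of the original patch), here is a
stand-alone sketch of the search performed by the
calculate_throttle_pct() helper added below. The parameter values are
QEMU's defaults (threshold 50, initial 20, increment 10, max 99); the
bandwidth and dirty rate are hypothetical. With a 1000 MiB/s link and
a 2048 MiB/s peak dirty rate, the first throttle level whose residual
dirty rate fits within 50% of the bandwidth is 80%, matching the 2G
row of the table below:

    #include <stdio.h>

    int main(void)
    {
        const double bandwidth = 1000.0;  /* MiB/s, measured migration rate (hypothetical) */
        const double dirty_rate = 2048.0; /* MiB/s, peak dirty page rate (hypothetical) */
        const int threshold = 50;         /* throttle_trigger_threshold */
        const int pct_initial = 20, pct_increment = 10, pct_max = 99;

        double budget = bandwidth * threshold / 100.0; /* 500 MiB/s */
        int pct;

        for (pct = pct_initial; pct <= pct_max; pct += pct_increment) {
            /*
             * Throttling vCPUs by pct% is assumed to cut the dirty rate
             * proportionally; stop at the first level that fits.
             */
            if (budget > dirty_rate * (1.0 - pct / 100.0)) {
                break;
            }
        }
        printf("throttle at %d%%\n", pct); /* Prints: throttle at 80% */
        return 0;
    }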

Test results after optimization (VM 8C32G, qemu stress tool):

mem_stress  calculated_auto_converge_throttle  bandwidth(MiB/s)
300M        0                                  1000M
400M        0                                  1000M
1G          50                                 1000M
2G          80                                 1000M
3G          90                                 1000M
4G          90                                 1000M
5G          90                                 1000M
6G          99                                 1000M
10G         99                                 1000M
20G         99                                 1000M
30G         99                                 1000M

Series optimization summary:

Related patch series:
[1] kvm, memory: Optimize dirty page collection for dirty ring
[2] kvm: Dynamically control the load of the reaper thread
[3] kvm: Dirty ring autoconverge optimization for
    kvm_cpu_synchronize_kick_all
[4] kvm: Introduce a dirty rate calculation method based on dirty ring
[5] migration: Calculate the appropriate throttle for autoconverge

Test environment:
Host: 64 cpus (Intel(R) Xeon(R) Gold 5218 CPU @ 2.30GHz), 512G memory,
10G NIC
VM: 2 cpus, 4G memory and 8 cpus, 32G memory
Memory stress: run the stress tool (qemu) in the VM to generate memory
pressure

Test1: Massive online migration (each test item run 50 to 200 times)

Test command:
virsh -t migrate $vm --live --p2p --unsafe --undefinesource
--persistent --auto-converge --migrateuri tcp://${data_ip_remote}

*********** Use optimized dirty ring ***********
ring_size  mem_stress  VM     average_migration_time(ms)
4096       1G          2C4G   15888
4096       3G          2C4G   13320
65536      1G          2C4G   10036
65536      3G          2C4G   12132
4096       4G          8C32G  53629
4096       8G          8C32G  62474
4096       30G         8C32G  99025
65536      4G          8C32G  45563
65536      8G          8C32G  61114
65536      30G         8C32G  102087

*********** Use unoptimized dirty ring ***********
ring_size  mem_stress  VM     average_migration_time(ms)
4096       1G          2C4G   23992
4096       3G          2C4G   44234
65536      1G          2C4G   24546
65536      3G          2C4G   44939
4096       4G          8C32G  88441
4096       8G          8C32G  may not complete
4096       30G         8C32G  602884
65536      4G          8C32G  335535
65536      8G          8C32G  1249232
65536      30G         8C32G  616939

*********** Use bitmap dirty tracking ***********
ring_size  mem_stress  VM     average_migration_time(ms)
0          1G          2C4G   24597
0          3G          2C4G   45254
0          4G          8C32G  103773
0          8G          8C32G  129626
0          30G         8C32G  588212

Compared with the old bitmap method and the unoptimized dirty ring,
the migration time of the optimized dirty ring is greatly improved,
especially when the virtual machine memory is large and the memory
pressure is high, where it achieves a five- to six-fold migration
speedup. During testing, migration with the unoptimized dirty ring
could not complete for a long time under certain memory pressure; the
optimized dirty ring did not encounter this problem.

Test2: qemu guestperf test

Test command parameters: --auto-converge --stress-mem XX
--downtime 300 --bandwidth 10000

*********** Use optimized dirty ring ***********
ring_size  stress  VM     Significant_perf   max_memory_update  cost_time(s)
                          _drop_duration(s)  speed(ms/GB)
4096       3G      2C4G   5.5                2962               23.5
65536      3G      2C4G   6                  3160               25
4096       3G      8C32G  13                 7921               38
4096       6G      8C32G  16                 11.6K              46
4096       10G     8C32G  12.1               11.2K              47.6
4096       20G     8C32G  20                 20.2K              71
4096       30G     8C32G  29.5               29K                94.5
65536      3G      8C32G  14                 8700               40
65536      6G      8C32G  15                 12K                46
65536      10G     8C32G  11.5               11.1k              47.5
65536      20G     8C32G  21                 20.9K              72
65536      30G     8C32G  29.5               29.1K              94.5

*********** Use unoptimized dirty ring ***********
ring_size  stress  VM     Significant_perf   max_memory_update  cost_time(s)
                          _drop_duration(s)  speed(ms/GB)
4096       3G      2C4G   23                 2766               46
65536      3G      2C4G   22.2               3283               46
4096       3G      8C32G  62                 48.8K              106
4096       6G      8C32G  68                 23.87K             124
4096       10G     8C32G  91                 16.87K             190
4096       20G     8C32G  152.8              28.65K             336.8
4096       30G     8C32G  187                41.19K             502
65536      3G      8C32G  71                 12.7K              67
65536      6G      8C32G  63                 12K                46
65536      10G     8C32G  88                 25.3k              120
65536      20G     8C32G  157.3              25K                391
65536      30G     8C32G  171                30.8K              487

*********** Use bitmap dirty tracking ***********
ring_size  stress  VM     Significant_perf   max_memory_update  cost_time(s)
                          _drop_duration(s)  speed(ms/GB)
0          3G      2C4G   18                 3300               38
0          3G      8C32G  38                 7571               66
0          6G      8C32G  61.5               10.5K              115.5
0          10G     8C32G  110                13.68k             180
0          20G     8C32G  161.6              24.4K              280
0          30G     8C32G  221.5              28.4K              337.5

The above test data show that guestperf performance with the optimized
dirty ring during migration is significantly better than with the
unoptimized dirty ring, and slightly better than with the bitmap
method. With the optimized dirty ring, migration time is greatly
reduced, and the period of significant memory performance degradation
is markedly shorter than in bitmap mode or unoptimized dirty ring
mode. Therefore, the optimized dirty ring better reduces the impact on
guests accessing memory during migration.

Signed-off-by: Chongyun Wu <wucy11@chinatelecom.cn>
---
 accel/kvm/kvm-all.c   |  7 ++++--
 migration/migration.c |  1 +
 migration/migration.h |  2 ++
 migration/ram.c       | 64 +++++++++++++++++++++++++++++++++++++++++++++++++---
 4 files changed, 69 insertions(+), 5 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index a158b1c..57126f1 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -800,7 +800,11 @@ int64_t kvm_dirty_ring_get_rate(void)
 
 int64_t kvm_dirty_ring_get_peak_rate(void)
 {
-    return kvm_state->reaper.counter.dirty_pages_period_peak_rate;
+    int64_t rate = kvm_state->reaper.counter.dirty_pages_period_peak_rate;
+    /* Reset the peak rate so a new peak is calculated from this moment */
+    kvm_state->reaper.counter.dirty_pages_period_peak_rate = 0;
+
+    return rate;
 }
 
 static void kvm_dirty_ring_reap_count(KVMState *s)
@@ -836,7 +840,6 @@ static void kvm_dirty_ring_reap_count(KVMState *s)
             s->reaper.counter.dirty_pages_period_peak_rate =
                 s->reaper.counter.dirty_pages_rate;
         }
-
         /* Reset counters */
         s->reaper.counter.dirty_pages_period = 0;
         s->reaper.counter.time_last_count = 0;
diff --git a/migration/migration.c b/migration/migration.c
index ca1db88..78ecf8c 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2070,6 +2070,7 @@ void migrate_init(MigrationState *s)
     s->vm_was_running = false;
     s->iteration_initial_bytes = 0;
     s->threshold_size = 0;
+    s->have_calculated_throttle_pct = false;
 }
 
 int migrate_add_blocker_internal(Error *reason, Error **errp)
diff --git a/migration/migration.h b/migration/migration.h
index 2de861d..7c525c9 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -333,6 +333,8 @@ struct MigrationState {
      * This save hostname when out-going migration starts
      */
     char *hostname;
+    /* Whether the throttle percentage for migration was already calculated */
+    bool have_calculated_throttle_pct;
 };
 
 void migrate_set_state(int *state, int old_state, int new_state);
diff --git a/migration/ram.c b/migration/ram.c
index 170e522..21642eb 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -63,6 +63,7 @@
 #include "qemu/userfaultfd.h"
 #endif /* defined(__linux__) */
 
+#include "sysemu/kvm.h"
 /***********************************************************/
 /* ram save/restore */
 
@@ -617,6 +618,64 @@ static size_t save_page_header(RAMState *rs, QEMUFile *f, RAMBlock *block,
 }
 
 /**
+ * Calculate the matching throttle level from the current migration
+ * bandwidth and the current dirty page rate, to avoid time-consuming
+ * and pointless attempts.
+ */
+static int calculate_throttle_pct(void)
+{
+    MigrationState *s = migrate_get_current();
+    uint64_t threshold = s->parameters.throttle_trigger_threshold;
+    uint64_t pct_initial = s->parameters.cpu_throttle_initial;
+    uint64_t pct_increment = s->parameters.cpu_throttle_increment;
+    int pct_max = s->parameters.max_cpu_throttle;
+
+    int matched_pct = 0;
+    float factor1 = 0.0;
+    float factor2 = 0.0;
+    int64_t dirty_pages_rate = 0;
+    double bandwidth_expect = 0.0;
+    double dirty_pages_rate_expect = 0.0;
+    double bandwidth = (s->mbps / 8) * 1024 * 1024;
+
+    if (kvm_dirty_ring_enabled()) {
+        dirty_pages_rate = kvm_dirty_ring_get_peak_rate() *
+                           qemu_target_page_size();
+    } else {
+        dirty_pages_rate = ram_counters.dirty_pages_rate *
+                           qemu_target_page_size();
+    }
+
+    if (dirty_pages_rate) {
+        factor1 = (float)threshold / 100;
+        bandwidth_expect = bandwidth * factor1;
+
+        for (uint64_t i = pct_initial; i <= pct_max;) {
+            factor2 = 1 - (float)i / 100;
+            dirty_pages_rate_expect = dirty_pages_rate * factor2;
+
+            if (bandwidth_expect > dirty_pages_rate_expect) {
+                matched_pct = i;
+                break;
+            }
+            i += pct_increment;
+        }
+
+        if (!matched_pct) {
+            info_report("No matching throttle pct found, "
+                        "pressure may be too high, using max");
+            matched_pct = pct_max;
+        }
+    } else {
+        matched_pct = pct_initial;
+    }
+
+    s->have_calculated_throttle_pct = true;
+
+    return matched_pct;
+}
+
+/**
  * mig_throttle_guest_down: throttle down the guest
 *
 * Reduce amount of guest cpu execution to hopefully slow down memory
@@ -629,7 +688,6 @@ static void mig_throttle_guest_down(uint64_t bytes_dirty_period,
                                     uint64_t bytes_dirty_threshold)
 {
     MigrationState *s = migrate_get_current();
-    uint64_t pct_initial = s->parameters.cpu_throttle_initial;
     uint64_t pct_increment = s->parameters.cpu_throttle_increment;
     bool pct_tailslow = s->parameters.cpu_throttle_tailslow;
     int pct_max = s->parameters.max_cpu_throttle;
@@ -638,8 +696,8 @@ static void mig_throttle_guest_down(uint64_t bytes_dirty_period,
     uint64_t cpu_now, cpu_ideal, throttle_inc;
 
     /* We have not started throttling yet. Let's start it. */
-    if (!cpu_throttle_active()) {
-        cpu_throttle_set(pct_initial);
+    if (!s->have_calculated_throttle_pct) {
+        cpu_throttle_set(MIN(calculate_throttle_pct(), pct_max));
     } else {
         /* Throttling already on, just increase the rate */
         if (!pct_tailslow) {
--
1.8.3.1