From nobody Mon Feb 9 14:34:26 2026
From: huangy81@chinatelecom.cn
To: qemu-devel
Subject: [PATCH v1 2/3] cpu-throttle: implement vCPU throttle
Date: Thu, 18 Nov 2021 14:07:21 +0800
Message-Id: <0537c7d112932f2d99df3dc0587bb246328e2d9d.1637214721.git.huangy81@chinatelecom.cn>
X-Mailer: git-send-email 1.8.3.1
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Cc: David Hildenbrand, Hyman, Juan Quintela, Richard Henderson,
 "Dr. David Alan Gilbert", Peter Xu, Paolo Bonzini,
 =?UTF-8?q?Philippe=20Mathieu-Daud=C3=A9?=

From: Hyman Huang(=E9=BB=84=E5=8B=87)

Implement dirty restraint by kicking the vCPU, as auto-converge does
during migration, but kick only the specified vCPU rather than every
vCPU of the VM.

Start a thread to track the dirty restraint status and adjust the
throttle percentage dynamically, depending on the current and quota
dirty rates.

Introduce utility functions in the header for the dirty restraint
implementation.

Signed-off-by: Hyman Huang(=E9=BB=84=E5=8B=87)
---
 include/sysemu/cpu-throttle.h   |  21 +++
 include/sysemu/dirtyrestraint.h |   2 +
 migration/dirtyrate.c           |   7 +
 softmmu/cpu-throttle.c          | 304 ++++++++++++++++++++++++++++++++++++=
++++
 softmmu/trace-events            |   5 +
 5 files changed, 339 insertions(+)

diff --git a/include/sysemu/cpu-throttle.h b/include/sysemu/cpu-throttle.h
index d65bdef..48215d2 100644
--- a/include/sysemu/cpu-throttle.h
+++ b/include/sysemu/cpu-throttle.h
@@ -65,4 +65,25 @@ bool cpu_throttle_active(void);
  */
 int cpu_throttle_get_percentage(void);
=20
+/**
+ * dirtyrestraint_state_init:
+ *
+ * initialize global state for dirty restraint
+ */
+void dirtyrestraint_state_init(int max_cpus);
+
+/**
+ * dirtyrestraint_vcpu:
+ *
+ * impose dirty restraint on vcpu until reaching the quota dirtyrate
+ */
+void dirtyrestraint_vcpu(int cpu_index,
+                         uint64_t quota);
+/**
+ * dirtyrestraint_cancel_vcpu:
+ *
+ * cancel dirty restraint for the specified vcpu
+ */
+void dirtyrestraint_cancel_vcpu(int cpu_index);
+
 #endif /* SYSEMU_CPU_THROTTLE_H */
diff --git a/include/sysemu/dirtyrestraint.h b/include/sysemu/dirtyrestrain=
t.h
index ca744af..b84a5c0 100644
--- a/include/sysemu/dirtyrestraint.h
+++ b/include/sysemu/dirtyrestraint.h
@@ -14,6 +14,8 @@
=20
 #define DIRTYRESTRAINT_CALC_PERIOD_TIME_S 15 /* 15s */
=20
+int64_t dirtyrestraint_calc_current(int cpu_index);
+
 void dirtyrestraint_calc_start(void);
=20
 void dirtyrestraint_calc_state_init(int max_cpus);
diff --git a/migration/dirtyrate.c b/migration/dirtyrate.c
index b453b3a..26919ff 100644
--- a/migration/dirtyrate.c
+++ b/migration/dirtyrate.c
@@ -137,6 +137,13 @@ static void *dirtyrestraint_calc_thread(void *opaque)
     return NULL;
 }
=20
+int64_t dirtyrestraint_calc_current(int cpu_index)
+{
+    DirtyRateVcpu *rates =3D dirtyrestraint_calc_state->data.rates;
+
+    return qatomic_read(&rates[cpu_index].dirty_rate);
+}
+
 void dirtyrestraint_calc_start(void)
 {
     if (likely(!qatomic_read(&dirtyrestraint_calc_state->enable))) {
diff --git a/softmmu/cpu-throttle.c b/softmmu/cpu-throttle.c
index 8c2144a..7a127a0 100644
--- a/softmmu/cpu-throttle.c
+++ b/softmmu/cpu-throttle.c
@@ -29,6 +29,8 @@
 #include "qemu/main-loop.h"
 #include "sysemu/cpus.h"
 #include "sysemu/cpu-throttle.h"
+#include "sysemu/dirtyrestraint.h"
+#include "trace.h"
=20
 /* vcpu throttling controls */
 static QEMUTimer *throttle_timer;
@@ -38,6 +40,308 @@ static unsigned int throttle_percentage;
 #define CPU_THROTTLE_PCT_MAX 99
 #define CPU_THROTTLE_TIMESLICE_NS 10000000
=20
+#define DIRTYRESTRAINT_TOLERANCE_RANGE 15 /* 15MB/s */
+
+#define DIRTYRESTRAINT_THROTTLE_HEAVY_WATERMARK 75
+#define DIRTYRESTRAINT_THROTTLE_SLIGHT_WATERMARK 90
+
+#define DIRTYRESTRAINT_THROTTLE_HEAVY_STEP_SIZE 5
+#define DIRTYRESTRAINT_THROTTLE_SLIGHT_STEP_SIZE 2
+
+typedef enum {
+    RESTRAIN_KEEP,
+    RESTRAIN_RATIO,
+    RESTRAIN_HEAVY,
+    RESTRAIN_SLIGHT,
+} RestrainPolicy;
+
+typedef struct DirtyRestraintState {
+    int cpu_index;
+    bool enabled;
+    uint64_t quota; /* quota dirtyrate MB/s */
+    QemuThread thread;
+    char *name; /* thread name */
+} DirtyRestraintState;
+
+struct {
+    DirtyRestraintState *states;
+    int max_cpus;
+} *dirtyrestraint_state;
+
+static inline bool dirtyrestraint_enabled(int cpu_index)
+{
+    return qatomic_read(&dirtyrestraint_state->states[cpu_index].enabled);
+}
+
+static inline void dirtyrestraint_set_quota(int cpu_index, uint64_t quota)
+{
+    qatomic_set(&dirtyrestraint_state->states[cpu_index].quota, quota);
+}
+
+static inline uint64_t dirtyrestraint_quota(int cpu_index)
+{
+    return qatomic_read(&dirtyrestraint_state->states[cpu_index].quota);
+}
+
+static int64_t dirtyrestraint_current(int cpu_index)
+{
+    return dirtyrestraint_calc_current(cpu_index);
+}
+
+static void dirtyrestraint_vcpu_thread(CPUState *cpu, run_on_cpu_data data)
+{
+    double pct;
+    double throttle_ratio;
+    int64_t sleeptime_ns, endtime_ns;
+    int *percentage =3D (int *)data.host_ptr;
+
+    pct =3D (double)(*percentage) / 100;
+    throttle_ratio =3D pct / (1 - pct);
+    /* Add 1ns to fix double's rounding error (like 0.9999999...) */
+    sleeptime_ns =3D (int64_t)(throttle_ratio * CPU_THROTTLE_TIMESLICE_NS =
+ 1);
+    endtime_ns =3D qemu_clock_get_ns(QEMU_CLOCK_REALTIME) + sleeptime_ns;
+    while (sleeptime_ns > 0 && !cpu->stop) {
+        if (sleeptime_ns > SCALE_MS) {
+            qemu_cond_timedwait_iothread(cpu->halt_cond,
+                                         sleeptime_ns / SCALE_MS);
+        } else {
+            qemu_mutex_unlock_iothread();
+            g_usleep(sleeptime_ns / SCALE_US);
+            qemu_mutex_lock_iothread();
+        }
+        sleeptime_ns =3D endtime_ns - qemu_clock_get_ns(QEMU_CLOCK_REALTIM=
E);
+    }
+    qatomic_set(&cpu->throttle_thread_scheduled, 0);
+
+    free(percentage);
+}
+
+static void do_dirtyrestraint(int cpu_index,
+                              int percentage)
+{
+    CPUState *cpu;
+    int64_t sleeptime_ns, starttime_ms, currenttime_ms;
+    int *pct_parameter;
+    double pct;
+
+    pct =3D (double) percentage / 100;
+
+    starttime_ms =3D qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+
+    while (true) {
+        CPU_FOREACH(cpu) {
+            if ((cpu_index =3D=3D cpu->cpu_index) &&
+                (!qatomic_xchg(&cpu->throttle_thread_scheduled, 1))) {
+                pct_parameter =3D malloc(sizeof(*pct_parameter));
+                *pct_parameter =3D percentage;
+                async_run_on_cpu(cpu, dirtyrestraint_vcpu_thread,
+                                 RUN_ON_CPU_HOST_PTR(pct_parameter));
+                break;
+            }
+        }
+
+        sleeptime_ns =3D CPU_THROTTLE_TIMESLICE_NS / (1 - pct);
+        g_usleep(sleeptime_ns / SCALE_US);
+
+        currenttime_ms =3D qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+        if (unlikely((currenttime_ms - starttime_ms) >
+                     (DIRTYRESTRAINT_CALC_PERIOD_TIME_S * 1000))) {
+            break;
+        }
+    }
+}
+
+static uint64_t dirtyrestraint_init_pct(uint64_t quota,
+                                        uint64_t current)
+{
+    uint64_t restraint_pct =3D 0;
+
+    if (quota >=3D current || (current =3D=3D 0) ||
+        ((current - quota) <=3D DIRTYRESTRAINT_TOLERANCE_RANGE)) {
+        restraint_pct =3D 0;
+    } else {
+        restraint_pct =3D (current - quota) * 100 / current;
+
+        restraint_pct =3D MIN(restraint_pct,
+                            DIRTYRESTRAINT_THROTTLE_HEAVY_WATERMARK);
+    }
+
+    return restraint_pct;
+}
+
+static RestrainPolicy dirtyrestraint_policy(unsigned int last_pct,
+                                            uint64_t quota,
+                                            uint64_t current)
+{
+    uint64_t max, min;
+
+    max =3D MAX(quota, current);
+    min =3D MIN(quota, current);
+    if ((max - min) <=3D DIRTYRESTRAINT_TOLERANCE_RANGE) {
+        return RESTRAIN_KEEP;
+    }
+    if (last_pct < DIRTYRESTRAINT_THROTTLE_HEAVY_WATERMARK) {
+        /* last percentage locates in [0, 75)*/
+        return RESTRAIN_RATIO;
+    } else if (last_pct < DIRTYRESTRAINT_THROTTLE_SLIGHT_WATERMARK) {
+        /* last percentage locates in [75, 90)*/
+        return RESTRAIN_HEAVY;
+    } else {
+        /* last percentage locates in [90, 99]*/
+        return RESTRAIN_SLIGHT;
+    }
+}
+
+static uint64_t dirtyrestraint_pct(unsigned int last_pct,
+                                   uint64_t quota,
+                                   uint64_t current)
+{
+    uint64_t restraint_pct =3D 0;
+    RestrainPolicy policy;
+    bool mitigate =3D (quota > current) ? true : false;
+
+    if (mitigate && ((current =3D=3D 0) ||
+        (last_pct <=3D DIRTYRESTRAINT_THROTTLE_SLIGHT_STEP_SIZE))) {
+        return 0;
+    }
+
+    policy =3D dirtyrestraint_policy(last_pct, quota, current);
+    switch (policy) {
+    case RESTRAIN_SLIGHT:
+        /* [90, 99] */
+        if (mitigate) {
+            restraint_pct =3D
+                last_pct - DIRTYRESTRAINT_THROTTLE_SLIGHT_STEP_SIZE;
+        } else {
+            restraint_pct =3D
+                last_pct + DIRTYRESTRAINT_THROTTLE_SLIGHT_STEP_SIZE;
+
+            restraint_pct =3D MIN(restraint_pct, CPU_THROTTLE_PCT_MAX);
+        }
+       break;
+    case RESTRAIN_HEAVY:
+        /* [75, 90) */
+        if (mitigate) {
+            restraint_pct =3D
+                last_pct - DIRTYRESTRAINT_THROTTLE_HEAVY_STEP_SIZE;
+        } else {
+            restraint_pct =3D
+                last_pct + DIRTYRESTRAINT_THROTTLE_HEAVY_STEP_SIZE;
+
+            restraint_pct =3D MIN(restraint_pct,
+                                DIRTYRESTRAINT_THROTTLE_SLIGHT_WATERMARK);
+        }
+       break;
+    case RESTRAIN_RATIO:
+        /* [0, 75) */
+        if (mitigate) {
+            if (last_pct <=3D (((quota - current) * 100 / quota) / 2)) {
+                restraint_pct =3D 0;
+            } else {
+                restraint_pct =3D last_pct -
+                    ((quota - current) * 100 / quota) / 2;
+                restraint_pct =3D MAX(restraint_pct, CPU_THROTTLE_PCT_MIN);
+            }
+        } else {
+            /*
+             * increase linearly with dirtyrate
+             * but tune it a little by dividing it by 2
+             */
+            restraint_pct =3D last_pct +
+                ((current - quota) * 100 / current) / 2;
+
+            restraint_pct =3D MIN(restraint_pct,
+                                DIRTYRESTRAINT_THROTTLE_HEAVY_WATERMARK);
+        }
+       break;
+    case RESTRAIN_KEEP:
+    default:
+       restraint_pct =3D last_pct;
+       break;
+    }
+
+    return restraint_pct;
+}
+
+static void *dirtyrestraint_thread(void *opaque)
+{
+    int cpu_index =3D *(int *)opaque;
+    uint64_t quota_dirtyrate, current_dirtyrate;
+    unsigned int last_pct =3D 0;
+    unsigned int pct =3D 0;
+
+    rcu_register_thread();
+
+    quota_dirtyrate =3D dirtyrestraint_quota(cpu_index);
+    current_dirtyrate =3D dirtyrestraint_current(cpu_index);
+
+    pct =3D dirtyrestraint_init_pct(quota_dirtyrate, current_dirtyrate);
+
+    do {
+        trace_dirtyrestraint_impose(cpu_index,
+            quota_dirtyrate, current_dirtyrate, pct);
+        if (pct =3D=3D 0) {
+            sleep(DIRTYRESTRAINT_CALC_PERIOD_TIME_S);
+        } else {
+            last_pct =3D pct;
+            do_dirtyrestraint(cpu_index, pct);
+        }
+
+        quota_dirtyrate =3D dirtyrestraint_quota(cpu_index);
+        current_dirtyrate =3D dirtyrestraint_current(cpu_index);
+
+        pct =3D dirtyrestraint_pct(last_pct, quota_dirtyrate, current_dirt=
yrate);
+    } while (dirtyrestraint_enabled(cpu_index));
+
+    rcu_unregister_thread();
+
+    return NULL;
+}
+
+void dirtyrestraint_cancel_vcpu(int cpu_index)
+{
+    qatomic_set(&dirtyrestraint_state->states[cpu_index].enabled, 0);
+}
+
+void dirtyrestraint_vcpu(int cpu_index,
+                         uint64_t quota)
+{
+    trace_dirtyrestraint_vcpu(cpu_index, quota);
+
+    dirtyrestraint_set_quota(cpu_index, quota);
+
+    if (unlikely(!dirtyrestraint_enabled(cpu_index))) {
+        qatomic_set(&dirtyrestraint_state->states[cpu_index].enabled, 1);
+        dirtyrestraint_state->states[cpu_index].name =3D
+            g_strdup_printf("dirtyrestraint-%d", cpu_index);
+        qemu_thread_create(&dirtyrestraint_state->states[cpu_index].thread,
+            dirtyrestraint_state->states[cpu_index].name,
+            dirtyrestraint_thread,
+            (void *)&dirtyrestraint_state->states[cpu_index].cpu_index,
+            QEMU_THREAD_DETACHED);
+    }
+
+    return;
+}
+
+void dirtyrestraint_state_init(int max_cpus)
+{
+    int i;
+
+    dirtyrestraint_state =3D g_malloc0(sizeof(*dirtyrestraint_state));
+
+    dirtyrestraint_state->states =3D
+            g_malloc0(sizeof(DirtyRestraintState) * max_cpus);
+
+    for (i =3D 0; i < max_cpus; i++) {
+        dirtyrestraint_state->states[i].cpu_index =3D i;
+    }
+
+    dirtyrestraint_state->max_cpus =3D max_cpus;
+
+    trace_dirtyrestraint_state_init(max_cpus);
+}
+
 static void cpu_throttle_thread(CPUState *cpu, run_on_cpu_data opaque)
 {
     double pct;
diff --git a/softmmu/trace-events b/softmmu/trace-events
index 9c88887..0307567 100644
--- a/softmmu/trace-events
+++ b/softmmu/trace-events
@@ -31,3 +31,8 @@ runstate_set(int current_state, const char *current_state=
_str, int new_state, co
 system_wakeup_request(int reason) "reason=3D%d"
 qemu_system_shutdown_request(int reason) "reason=3D%d"
 qemu_system_powerdown_request(void) ""
+
+#cpu-throttle.c
+dirtyrestraint_state_init(int max_cpus) "dirtyrate restraint init: max cpu=
s %d"
+dirtyrestraint_impose(int cpu_index, uint64_t quota, uint64_t current, int=
 pct) "CPU[%d] impose dirtyrate restraint: quota %" PRIu64 ", current %" PR=
Iu64 ", percentage %d"
+dirtyrestraint_vcpu(int cpu_index, uint64_t quota) "CPU[%d] dirtyrate rest=
raint, quota dirtyrate %"PRIu64
--=20
1.8.3.1
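
Note on the arithmetic: the following is a minimal standalone sketch, not
part of the patch, of two calculations the throttle relies on: how a
throttle percentage maps to the sleep time injected per 10 ms timeslice,
and how the first percentage is derived from the quota and current dirty
rates. The constants mirror the patch's macros. The names sketch_sleep_ns
and sketch_init_pct are hypothetical and do not exist in QEMU, and the
sketch leaves out locking, the vCPU kick and the trace points.

```c
#include <stdint.h>

#define SKETCH_TIMESLICE_NS    10000000LL /* mirrors CPU_THROTTLE_TIMESLICE_NS */
#define SKETCH_TOLERANCE_MBPS  15         /* mirrors DIRTYRESTRAINT_TOLERANCE_RANGE */
#define SKETCH_HEAVY_WATERMARK 75         /* mirrors DIRTYRESTRAINT_THROTTLE_HEAVY_WATERMARK */

/*
 * Hypothetical helper: nanoseconds a vCPU sleeps for every timeslice it
 * runs when throttled at `percentage` percent (valid for 0..99).
 * Sleeping ratio * timeslice makes the sleeping share of wall time equal
 * to the requested percentage.
 */
static int64_t sketch_sleep_ns(int percentage)
{
    double pct = (double)percentage / 100;
    double ratio = pct / (1 - pct);

    /* The extra 1ns guards against the double rounding just below an integer. */
    return (int64_t)(ratio * SKETCH_TIMESLICE_NS + 1);
}

/*
 * Hypothetical helper: first throttle percentage for a vCPU, given the
 * quota and the measured dirty rate (both in MB/s). No throttle is
 * applied when the rate is within the quota or inside the tolerance
 * range; otherwise the overshoot ratio is used, capped at the heavy
 * watermark. The patch's explicit `current == 0` check is covered here
 * by `quota >= current`, since both values are unsigned.
 */
static uint64_t sketch_init_pct(uint64_t quota, uint64_t current)
{
    uint64_t pct;

    if (quota >= current || current - quota <= SKETCH_TOLERANCE_MBPS) {
        return 0;
    }

    pct = (current - quota) * 100 / current;
    return pct < SKETCH_HEAVY_WATERMARK ? pct : SKETCH_HEAVY_WATERMARK;
}
```

For instance, under these assumptions a vCPU dirtying 200 MB/s against a
100 MB/s quota would start at 50%, which corresponds to roughly 10 ms of
sleep for every 10 ms of run time.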