From nobody Thu Apr 25 09:48:21 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org ARC-Seal: i=1; a=rsa-sha256; t=1590087342; cv=none; d=zohomail.com; s=zohoarc; b=e0pT9cfdpmcqnaZaSOERwEXLEF3p9zswcmelNca47+NsvCX4lo+WpmpyrmfxSqdMcPg0IvwelKHAyy0IebbO2kOiPwhHSdRKN1liErKYDwI9rxkzvAqeKfnYf2rdBV/ugWEc0eh4ackRws7yJq2fqvnT1LVt7nveSYpsq0xxtTs= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1590087342; h=Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:Message-ID:References:Sender:Subject:To; bh=YhIgy1Dlz/IC/nTk/8RAl7lz0vQug70v0BzMwBv+YcY=; b=ebdp/KU1bfSCE7hFPay+98MsCf7tBECKxG4wd0m80I4/Wd0KbhBlOMtM0fxpf+W/+YWcfM0bIUciuUW64HGnwHTcOxRLDa5Yh0/HSlfFMB4oBbJlFmsEC1eJ+yH89QKmbwUFi3zye36rkxYhtETPvErhixX55KN1OhyyHIL0lkk= ARC-Authentication-Results: i=1; mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1590087342058819.1363727768228; Thu, 21 May 2020 11:55:42 -0700 (PDT) Received: from localhost ([::1]:42162 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jbqM0-0002Ji-G1 for importer@patchew.org; Thu, 21 May 2020 14:55:40 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:35252) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jbqKd-0000v2-MK for qemu-devel@nongnu.org; Thu, 21 May 2020 14:54:15 -0400 Received: from mx2.suse.de ([195.135.220.15]:49886) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jbqKb-00040i-VX for qemu-devel@nongnu.org; Thu, 21 May 2020 14:54:15 -0400 Received: from relay2.suse.de (unknown [195.135.220.254]) by mx2.suse.de (Postfix) with ESMTP id B8C4DAFF8; Thu, 21 May 2020 18:54:14 +0000 (UTC) X-Virus-Scanned: by amavisd-new at test-mx.suse.de From: Claudio Fontana To: Paolo Bonzini , =?UTF-8?q?Alex=20Benn=C3=A9e?= , Peter Maydell , =?UTF-8?q?Philippe=20Mathieu-Daud=C3=A9?= Subject: [RFC 1/3] cpu-throttle: new module, extracted from cpus.c Date: Thu, 21 May 2020 20:54:05 +0200 Message-Id: <20200521185407.25311-2-cfontana@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20200521185407.25311-1-cfontana@suse.de> References: <20200521185407.25311-1-cfontana@suse.de> Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=195.135.220.15; envelope-from=cfontana@suse.de; helo=mx2.suse.de X-detected-operating-system: by eggs.gnu.org: First seen = 2020/05/21 02:02:44 X-ACL-Warn: Detected OS = Linux 2.2.x-3.x (no timestamps) [generic] X-Spam_score_int: -41 X-Spam_score: -4.2 X-Spam_bar: ---- X-Spam_report: (-4.2 / 5.0 requ) BAYES_00=-1.9, RCVD_IN_DNSWL_MED=-2.3, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Laurent Vivier , Thomas Huth , Eduardo Habkost , Marcelo Tosatti , "open list:All patches CC here" , Roman Bolshakov , Wenchao Wang , Colin Xu , Claudio Fontana , "open list:X86 HAXM CPUs" , Sunil Muthuswamy , Richard Henderson Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" this is a first step in the refactoring of cpus.c. Signed-off-by: Claudio Fontana --- MAINTAINERS | 1 + Makefile.target | 8 ++- cpu-throttle.c | 122 ++++++++++++++++++++++++++++++++++++++= ++++ cpus.c | 95 +++----------------------------- include/hw/core/cpu.h | 37 ------------- include/qemu/main-loop.h | 5 ++ include/sysemu/cpu-throttle.h | 50 +++++++++++++++++ migration/migration.c | 1 + migration/ram.c | 1 + 9 files changed, 195 insertions(+), 125 deletions(-) create mode 100644 cpu-throttle.c create mode 100644 include/sysemu/cpu-throttle.h diff --git a/MAINTAINERS b/MAINTAINERS index 87a412c229..35864a275a 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -2142,6 +2142,7 @@ Main loop M: Paolo Bonzini S: Maintained F: cpus.c +F: cpu-throttle.c F: include/qemu/main-loop.h F: include/sysemu/runstate.h F: util/main-loop.c diff --git a/Makefile.target b/Makefile.target index 8ed1eba95b..60cfa2a78b 100644 --- a/Makefile.target +++ b/Makefile.target @@ -152,7 +152,13 @@ endif #CONFIG_BSD_USER ######################################################### # System emulator target ifdef CONFIG_SOFTMMU -obj-y +=3D arch_init.o cpus.o gdbstub.o balloon.o ioport.o +obj-y +=3D arch_init.o +obj-y +=3D cpus.o +obj-y +=3D cpu-throttle.o +obj-y +=3D gdbstub.o +obj-y +=3D balloon.o +obj-y +=3D ioport.o + obj-y +=3D qtest.o obj-y +=3D dump/ obj-y +=3D hw/ diff --git a/cpu-throttle.c b/cpu-throttle.c new file mode 100644 index 0000000000..4e6b2818ca --- /dev/null +++ b/cpu-throttle.c @@ -0,0 +1,122 @@ +/* + * QEMU System Emulator + * + * Copyright (c) 2003-2008 Fabrice Bellard + * + * Permission is hereby granted, free of charge, to any person obtaining a= copy + * of this software and associated documentation files (the "Software"), t= o deal + * in the Software without restriction, including without limitation the r= ights + * to use, copy, modify, merge, publish, distribute, sublicense, and/or se= ll + * copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included= in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS= OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OT= HER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING= FROM, + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS = IN + * THE SOFTWARE. + */ + +#include "qemu/osdep.h" +#include "qemu-common.h" +#include "qemu/thread.h" +#include "hw/core/cpu.h" +#include "qemu/main-loop.h" +#include "sysemu/cpus.h" +#include "sysemu/cpu-throttle.h" + +/* vcpu throttling controls */ +static QEMUTimer *throttle_timer; +static unsigned int throttle_percentage; + +#define CPU_THROTTLE_PCT_MIN 1 +#define CPU_THROTTLE_PCT_MAX 99 +#define CPU_THROTTLE_TIMESLICE_NS 10000000 + +static void cpu_throttle_thread(CPUState *cpu, run_on_cpu_data opaque) +{ + double pct; + double throttle_ratio; + int64_t sleeptime_ns, endtime_ns; + + if (!cpu_throttle_get_percentage()) { + return; + } + + pct =3D (double)cpu_throttle_get_percentage() / 100; + throttle_ratio =3D pct / (1 - pct); + /* Add 1ns to fix double's rounding error (like 0.9999999...) */ + sleeptime_ns =3D (int64_t)(throttle_ratio * CPU_THROTTLE_TIMESLICE_NS = + 1); + endtime_ns =3D qemu_clock_get_ns(QEMU_CLOCK_REALTIME) + sleeptime_ns; + while (sleeptime_ns > 0 && !cpu->stop) { + if (sleeptime_ns > SCALE_MS) { + qemu_cond_timedwait_iothread(cpu->halt_cond, + sleeptime_ns / SCALE_MS); + } else { + qemu_mutex_unlock_iothread(); + g_usleep(sleeptime_ns / SCALE_US); + qemu_mutex_lock_iothread(); + } + sleeptime_ns =3D endtime_ns - qemu_clock_get_ns(QEMU_CLOCK_REALTIM= E); + } + atomic_set(&cpu->throttle_thread_scheduled, 0); +} + +static void cpu_throttle_timer_tick(void *opaque) +{ + CPUState *cpu; + double pct; + + /* Stop the timer if needed */ + if (!cpu_throttle_get_percentage()) { + return; + } + CPU_FOREACH(cpu) { + if (!atomic_xchg(&cpu->throttle_thread_scheduled, 1)) { + async_run_on_cpu(cpu, cpu_throttle_thread, + RUN_ON_CPU_NULL); + } + } + + pct =3D (double)cpu_throttle_get_percentage() / 100; + timer_mod(throttle_timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL_RT) + + CPU_THROTTLE_TIMESLICE_NS / (1 - pct)); +} + +void cpu_throttle_set(int new_throttle_pct) +{ + /* Ensure throttle percentage is within valid range */ + new_throttle_pct =3D MIN(new_throttle_pct, CPU_THROTTLE_PCT_MAX); + new_throttle_pct =3D MAX(new_throttle_pct, CPU_THROTTLE_PCT_MIN); + + atomic_set(&throttle_percentage, new_throttle_pct); + + timer_mod(throttle_timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL_RT) + + CPU_THROTTLE_TIMESLICE_NS); +} + +void cpu_throttle_stop(void) +{ + atomic_set(&throttle_percentage, 0); +} + +bool cpu_throttle_active(void) +{ + return (cpu_throttle_get_percentage() !=3D 0); +} + +int cpu_throttle_get_percentage(void) +{ + return atomic_read(&throttle_percentage); +} + +void cpu_throttle_init(void) +{ + throttle_timer =3D timer_new_ns(QEMU_CLOCK_VIRTUAL_RT, + cpu_throttle_timer_tick, NULL); +} diff --git a/cpus.c b/cpus.c index 5670c96bcf..3a46a4fc2b 100644 --- a/cpus.c +++ b/cpus.c @@ -61,6 +61,8 @@ #include "hw/boards.h" #include "hw/hw.h" =20 +#include "sysemu/cpu-throttle.h" + #ifdef CONFIG_LINUX =20 #include @@ -84,14 +86,6 @@ static QemuMutex qemu_global_mutex; int64_t max_delay; int64_t max_advance; =20 -/* vcpu throttling controls */ -static QEMUTimer *throttle_timer; -static unsigned int throttle_percentage; - -#define CPU_THROTTLE_PCT_MIN 1 -#define CPU_THROTTLE_PCT_MAX 99 -#define CPU_THROTTLE_TIMESLICE_NS 10000000 - bool cpu_is_stopped(CPUState *cpu) { return cpu->stopped || !runstate_is_running(); @@ -710,90 +704,12 @@ static const VMStateDescription vmstate_timers =3D { } }; =20 -static void cpu_throttle_thread(CPUState *cpu, run_on_cpu_data opaque) -{ - double pct; - double throttle_ratio; - int64_t sleeptime_ns, endtime_ns; - - if (!cpu_throttle_get_percentage()) { - return; - } - - pct =3D (double)cpu_throttle_get_percentage()/100; - throttle_ratio =3D pct / (1 - pct); - /* Add 1ns to fix double's rounding error (like 0.9999999...) */ - sleeptime_ns =3D (int64_t)(throttle_ratio * CPU_THROTTLE_TIMESLICE_NS = + 1); - endtime_ns =3D qemu_clock_get_ns(QEMU_CLOCK_REALTIME) + sleeptime_ns; - while (sleeptime_ns > 0 && !cpu->stop) { - if (sleeptime_ns > SCALE_MS) { - qemu_cond_timedwait(cpu->halt_cond, &qemu_global_mutex, - sleeptime_ns / SCALE_MS); - } else { - qemu_mutex_unlock_iothread(); - g_usleep(sleeptime_ns / SCALE_US); - qemu_mutex_lock_iothread(); - } - sleeptime_ns =3D endtime_ns - qemu_clock_get_ns(QEMU_CLOCK_REALTIM= E); - } - atomic_set(&cpu->throttle_thread_scheduled, 0); -} - -static void cpu_throttle_timer_tick(void *opaque) -{ - CPUState *cpu; - double pct; - - /* Stop the timer if needed */ - if (!cpu_throttle_get_percentage()) { - return; - } - CPU_FOREACH(cpu) { - if (!atomic_xchg(&cpu->throttle_thread_scheduled, 1)) { - async_run_on_cpu(cpu, cpu_throttle_thread, - RUN_ON_CPU_NULL); - } - } - - pct =3D (double)cpu_throttle_get_percentage()/100; - timer_mod(throttle_timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL_RT) + - CPU_THROTTLE_TIMESLICE_NS / (1-pct)); -} - -void cpu_throttle_set(int new_throttle_pct) -{ - /* Ensure throttle percentage is within valid range */ - new_throttle_pct =3D MIN(new_throttle_pct, CPU_THROTTLE_PCT_MAX); - new_throttle_pct =3D MAX(new_throttle_pct, CPU_THROTTLE_PCT_MIN); - - atomic_set(&throttle_percentage, new_throttle_pct); - - timer_mod(throttle_timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL_RT) + - CPU_THROTTLE_TIMESLICE_NS); -} - -void cpu_throttle_stop(void) -{ - atomic_set(&throttle_percentage, 0); -} - -bool cpu_throttle_active(void) -{ - return (cpu_throttle_get_percentage() !=3D 0); -} - -int cpu_throttle_get_percentage(void) -{ - return atomic_read(&throttle_percentage); -} - void cpu_ticks_init(void) { seqlock_init(&timers_state.vm_clock_seqlock); qemu_spin_init(&timers_state.vm_clock_lock); vmstate_register(NULL, 0, &vmstate_timers, &timers_state); - throttle_timer =3D timer_new_ns(QEMU_CLOCK_VIRTUAL_RT, - cpu_throttle_timer_tick, NULL); + cpu_throttle_init(); } =20 void configure_icount(QemuOpts *opts, Error **errp) @@ -1852,6 +1768,11 @@ void qemu_cond_wait_iothread(QemuCond *cond) qemu_cond_wait(cond, &qemu_global_mutex); } =20 +void qemu_cond_timedwait_iothread(QemuCond *cond, int ms) +{ + qemu_cond_timedwait(cond, &qemu_global_mutex, ms); +} + static bool all_vcpus_paused(void) { CPUState *cpu; diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h index 07f7698155..e6b75d456c 100644 --- a/include/hw/core/cpu.h +++ b/include/hw/core/cpu.h @@ -817,43 +817,6 @@ bool cpu_exists(int64_t id); */ CPUState *cpu_by_arch_id(int64_t id); =20 -/** - * cpu_throttle_set: - * @new_throttle_pct: Percent of sleep time. Valid range is 1 to 99. - * - * Throttles all vcpus by forcing them to sleep for the given percentage of - * time. A throttle_percentage of 25 corresponds to a 75% duty cycle rough= ly. - * (example: 10ms sleep for every 30ms awake). - * - * cpu_throttle_set can be called as needed to adjust new_throttle_pct. - * Once the throttling starts, it will remain in effect until cpu_throttle= _stop - * is called. - */ -void cpu_throttle_set(int new_throttle_pct); - -/** - * cpu_throttle_stop: - * - * Stops the vcpu throttling started by cpu_throttle_set. - */ -void cpu_throttle_stop(void); - -/** - * cpu_throttle_active: - * - * Returns: %true if the vcpus are currently being throttled, %false other= wise. - */ -bool cpu_throttle_active(void); - -/** - * cpu_throttle_get_percentage: - * - * Returns the vcpu throttle percentage. See cpu_throttle_set for details. - * - * Returns: The throttle percentage in range 1 to 99. - */ -int cpu_throttle_get_percentage(void); - #ifndef CONFIG_USER_ONLY =20 typedef void (*CPUInterruptHandler)(CPUState *, int); diff --git a/include/qemu/main-loop.h b/include/qemu/main-loop.h index a6d20b0719..2fa3d90ad6 100644 --- a/include/qemu/main-loop.h +++ b/include/qemu/main-loop.h @@ -263,6 +263,11 @@ int qemu_add_child_watch(pid_t pid); */ bool qemu_mutex_iothread_locked(void); =20 +/* + * qemu_cond_timedwait_iothread: like the previous, but with timeout + */ +void qemu_cond_timedwait_iothread(QemuCond *cond, int ms); + /** * qemu_mutex_lock_iothread: Lock the main loop mutex. * diff --git a/include/sysemu/cpu-throttle.h b/include/sysemu/cpu-throttle.h new file mode 100644 index 0000000000..22356502a5 --- /dev/null +++ b/include/sysemu/cpu-throttle.h @@ -0,0 +1,50 @@ +#ifndef SYSEMU_CPU_THROTTLE_H +#define SYSEMU_CPU_THROTTLE_H + +#include "qemu/timer.h" + +/** + * cpu_throttle_init: + * + * Initialize the CPU throttling API. + */ +void cpu_throttle_init(void); + +/** + * cpu_throttle_set: + * @new_throttle_pct: Percent of sleep time. Valid range is 1 to 99. + * + * Throttles all vcpus by forcing them to sleep for the given percentage of + * time. A throttle_percentage of 25 corresponds to a 75% duty cycle rough= ly. + * (example: 10ms sleep for every 30ms awake). + * + * cpu_throttle_set can be called as needed to adjust new_throttle_pct. + * Once the throttling starts, it will remain in effect until cpu_throttle= _stop + * is called. + */ +void cpu_throttle_set(int new_throttle_pct); + +/** + * cpu_throttle_stop: + * + * Stops the vcpu throttling started by cpu_throttle_set. + */ +void cpu_throttle_stop(void); + +/** + * cpu_throttle_active: + * + * Returns: %true if the vcpus are currently being throttled, %false other= wise. + */ +bool cpu_throttle_active(void); + +/** + * cpu_throttle_get_percentage: + * + * Returns the vcpu throttle percentage. See cpu_throttle_set for details. + * + * Returns: The throttle percentage in range 1 to 99. + */ +int cpu_throttle_get_percentage(void); + +#endif /* SYSEMU_CPU_THROTTLE_H */ diff --git a/migration/migration.c b/migration/migration.c index 0bb042a0f7..dd18323f32 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -23,6 +23,7 @@ #include "socket.h" #include "sysemu/runstate.h" #include "sysemu/sysemu.h" +#include "sysemu/cpu-throttle.h" #include "rdma.h" #include "ram.h" #include "migration/global_state.h" diff --git a/migration/ram.c b/migration/ram.c index 859f835f1a..527f0c7316 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -52,6 +52,7 @@ #include "migration/colo.h" #include "block.h" #include "sysemu/sysemu.h" +#include "sysemu/cpu-throttle.h" #include "savevm.h" #include "qemu/iov.h" #include "multifd.h" --=20 2.16.4 From nobody Thu Apr 25 09:48:21 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org ARC-Seal: i=1; a=rsa-sha256; t=1590087479; cv=none; d=zohomail.com; s=zohoarc; b=XBLeEFL0mXwidKyrDgG1k0AN27hnrcMiXDpRrCO1t9TXJMuVBHkW7GJHxyl8f2mJxTTGO1S1EIXoRjQ6YQhRA3WJYo7lXFW9Cu1yRsZ5ikNAczgIFPOK+WZdmOFdE5H+mwDv8T9005w9w5zj7DE/0dJljO+VoLBup24XYZ1SmnU= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1590087479; h=Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:Message-ID:References:Sender:Subject:To; bh=FjO1AI1p6eVoeErlPRsLi80I804mR21m8ae4DI+Zrn0=; b=JBbdM0q9NvgEjruysrDBJRD4tqmFEJy8EYt2YY8YcfxbKyy3wMhahI7MW70PZ8hpnDaVbwbyyTfwdxX6aiIzVvXTPZoNr7r+8bfWq5tBNFg7NXgBY/2nQykdfzI3i3Mh/KEEqwR3HqggFv3YslujtB+8hKPVBIeq+KLq0ysaKC8= ARC-Authentication-Results: i=1; mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1590087479333785.8734410125821; Thu, 21 May 2020 11:57:59 -0700 (PDT) Received: from localhost ([::1]:49566 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jbqOD-0005zF-M0 for importer@patchew.org; Thu, 21 May 2020 14:57:57 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:35320) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jbqKj-00017N-2e for qemu-devel@nongnu.org; Thu, 21 May 2020 14:54:21 -0400 Received: from mx2.suse.de ([195.135.220.15]:49916) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jbqKd-00041k-4k for qemu-devel@nongnu.org; Thu, 21 May 2020 14:54:20 -0400 Received: from relay2.suse.de (unknown [195.135.220.254]) by mx2.suse.de (Postfix) with ESMTP id 51FDCB019; Thu, 21 May 2020 18:54:15 +0000 (UTC) X-Virus-Scanned: by amavisd-new at test-mx.suse.de From: Claudio Fontana To: Paolo Bonzini , =?UTF-8?q?Alex=20Benn=C3=A9e?= , Peter Maydell , =?UTF-8?q?Philippe=20Mathieu-Daud=C3=A9?= Subject: [RFC 2/3] cpu-timers: new module extracted from cpus.c Date: Thu, 21 May 2020 20:54:06 +0200 Message-Id: <20200521185407.25311-3-cfontana@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20200521185407.25311-1-cfontana@suse.de> References: <20200521185407.25311-1-cfontana@suse.de> Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=195.135.220.15; envelope-from=cfontana@suse.de; helo=mx2.suse.de X-detected-operating-system: by eggs.gnu.org: First seen = 2020/05/21 02:02:44 X-ACL-Warn: Detected OS = Linux 2.2.x-3.x (no timestamps) [generic] X-Spam_score_int: -41 X-Spam_score: -4.2 X-Spam_bar: ---- X-Spam_report: (-4.2 / 5.0 requ) BAYES_00=-1.9, RCVD_IN_DNSWL_MED=-2.3, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Laurent Vivier , Thomas Huth , Eduardo Habkost , Marcelo Tosatti , "open list:All patches CC here" , Roman Bolshakov , Wenchao Wang , Colin Xu , Claudio Fontana , "open list:X86 HAXM CPUs" , Sunil Muthuswamy , Richard Henderson Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Claudio Fontana --- MAINTAINERS | 1 + Makefile.target | 1 + accel/qtest.c | 3 +- accel/tcg/cpu-exec.c | 43 ++- accel/tcg/tcg-all.c | 7 +- accel/tcg/translate-all.c | 3 +- cpu-timers.c | 776 ++++++++++++++++++++++++++++++++++++++++= ++++ cpus.c | 731 +---------------------------------------- docs/replay.txt | 6 +- exec.c | 4 - hw/core/ptimer.c | 6 +- hw/i386/x86.c | 1 + include/exec/cpu-all.h | 4 + include/exec/exec-all.h | 4 +- include/qemu/timer.h | 20 -- include/sysemu/cpu-timers.h | 73 +++++ include/sysemu/cpus.h | 12 +- include/sysemu/replay.h | 4 +- qtest.c | 2 +- replay/replay.c | 6 +- softmmu/vl.c | 8 +- stubs/clock-warp.c | 4 +- stubs/cpu-get-clock.c | 2 +- stubs/cpu-get-icount.c | 14 +- target/alpha/translate.c | 3 +- target/arm/helper.c | 7 +- target/riscv/csr.c | 8 +- tests/ptimer-test-stubs.c | 6 + tests/test-timed-average.c | 2 +- util/main-loop.c | 4 +- util/qemu-timer.c | 9 +- 31 files changed, 965 insertions(+), 809 deletions(-) create mode 100644 cpu-timers.c create mode 100644 include/sysemu/cpu-timers.h diff --git a/MAINTAINERS b/MAINTAINERS index 35864a275a..1b3b17fda8 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -2143,6 +2143,7 @@ M: Paolo Bonzini S: Maintained F: cpus.c F: cpu-throttle.c +F: cpu-timers.c F: include/qemu/main-loop.h F: include/sysemu/runstate.h F: util/main-loop.c diff --git a/Makefile.target b/Makefile.target index 60cfa2a78b..1d40237375 100644 --- a/Makefile.target +++ b/Makefile.target @@ -155,6 +155,7 @@ ifdef CONFIG_SOFTMMU obj-y +=3D arch_init.o obj-y +=3D cpus.o obj-y +=3D cpu-throttle.o +obj-y +=3D cpu-timers.o obj-y +=3D gdbstub.o obj-y +=3D balloon.o obj-y +=3D ioport.o diff --git a/accel/qtest.c b/accel/qtest.c index 5b88f55921..ef9ee0941a 100644 --- a/accel/qtest.c +++ b/accel/qtest.c @@ -19,13 +19,14 @@ #include "sysemu/accel.h" #include "sysemu/qtest.h" #include "sysemu/cpus.h" +#include "sysemu/cpu-timers.h" =20 static int qtest_init_accel(MachineState *ms) { QemuOpts *opts =3D qemu_opts_create(qemu_find_opts("icount"), NULL, 0, &error_abort); qemu_opt_set(opts, "shift", "0", &error_abort); - configure_icount(opts, &error_abort); + icount_configure(opts, &error_abort); qemu_opts_del(opts); return 0; } diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c index d95c4848a4..82155c1db3 100644 --- a/accel/tcg/cpu-exec.c +++ b/accel/tcg/cpu-exec.c @@ -19,6 +19,7 @@ =20 #include "qemu/osdep.h" #include "qemu-common.h" +#include "qemu/qemu-print.h" #include "cpu.h" #include "trace.h" #include "disas/disas.h" @@ -36,6 +37,8 @@ #include "hw/i386/apic.h" #endif #include "sysemu/cpus.h" +#include "exec/cpu-all.h" +#include "sysemu/cpu-timers.h" #include "sysemu/replay.h" =20 /* -icount align implementation. */ @@ -56,6 +59,9 @@ typedef struct SyncClocks { #define MAX_DELAY_PRINT_RATE 2000000000LL #define MAX_NB_PRINTS 100 =20 +static int64_t max_delay; +static int64_t max_advance; + static void align_clocks(SyncClocks *sc, CPUState *cpu) { int64_t cpu_icount; @@ -65,7 +71,7 @@ static void align_clocks(SyncClocks *sc, CPUState *cpu) } =20 cpu_icount =3D cpu->icount_extra + cpu_neg(cpu)->icount_decr.u16.low; - sc->diff_clk +=3D cpu_icount_to_ns(sc->last_cpu_icount - cpu_icount); + sc->diff_clk +=3D icount_to_ns(sc->last_cpu_icount - cpu_icount); sc->last_cpu_icount =3D cpu_icount; =20 if (sc->diff_clk > VM_CLOCK_ADVANCE) { @@ -98,9 +104,9 @@ static void print_delay(const SyncClocks *sc) (-sc->diff_clk / (float)1000000000LL < (threshold_delay - THRESHOLD_REDUCE))) { threshold_delay =3D (-sc->diff_clk / 1000000000LL) + 1; - printf("Warning: The guest is now late by %.1f to %.1f seconds= \n", - threshold_delay - 1, - threshold_delay); + qemu_printf("Warning: The guest is now late by %.1f to %.1f se= conds\n", + threshold_delay - 1, + threshold_delay); nb_prints++; last_realtime_clock =3D sc->realtime_clock; } @@ -597,7 +603,7 @@ static inline bool cpu_handle_interrupt(CPUState *cpu, =20 /* Finally, check if we need to exit to the main loop. */ if (unlikely(atomic_read(&cpu->exit_request)) - || (use_icount + || (icount_enabled() && cpu_neg(cpu)->icount_decr.u16.low + cpu->icount_extra =3D= =3D 0)) { atomic_set(&cpu->exit_request, 0); if (cpu->exception_index =3D=3D -1) { @@ -638,10 +644,10 @@ static inline void cpu_loop_exec_tb(CPUState *cpu, Tr= anslationBlock *tb, } =20 /* Instruction counter expired. */ - assert(use_icount); + assert(icount_enabled()); #ifndef CONFIG_USER_ONLY /* Ensure global icount has gone forward */ - cpu_update_icount(cpu); + icount_update(cpu); /* Refill decrementer and continue execution. */ insns_left =3D MIN(0xffff, cpu->icount_budget); cpu_neg(cpu)->icount_decr.u16.low =3D insns_left; @@ -741,3 +747,26 @@ int cpu_exec(CPUState *cpu) =20 return ret; } + +#ifndef CONFIG_USER_ONLY + +void dump_drift_info(void) +{ + if (!icount_enabled()) { + return; + } + + qemu_printf("Host - Guest clock %"PRIi64" ms\n", + (cpu_get_clock() - icount_get()) / SCALE_MS); + if (icount_align_option) { + qemu_printf("Max guest delay %"PRIi64" ms\n", + -max_delay / SCALE_MS); + qemu_printf("Max guest advance %"PRIi64" ms\n", + max_advance / SCALE_MS); + } else { + qemu_printf("Max guest delay NA\n"); + qemu_printf("Max guest advance NA\n"); + } +} + +#endif /* !CONFIG_USER_ONLY */ diff --git a/accel/tcg/tcg-all.c b/accel/tcg/tcg-all.c index 3b4fda5640..e27385d051 100644 --- a/accel/tcg/tcg-all.c +++ b/accel/tcg/tcg-all.c @@ -29,6 +29,7 @@ #include "qom/object.h" #include "cpu.h" #include "sysemu/cpus.h" +#include "sysemu/cpu-timers.h" #include "qemu/main-loop.h" #include "tcg/tcg.h" #include "qapi/error.h" @@ -65,7 +66,7 @@ static void tcg_handle_interrupt(CPUState *cpu, int mask) qemu_cpu_kick(cpu); } else { atomic_set(&cpu_neg(cpu)->icount_decr.u16.high, -1); - if (use_icount && + if (icount_enabled() && !cpu->can_do_io && (mask & ~old_mask) !=3D 0) { cpu_abort(cpu, "Raised interrupt while not in I/O function"); @@ -104,7 +105,7 @@ static bool check_tcg_memory_orders_compatible(void) =20 static bool default_mttcg_enabled(void) { - if (use_icount || TCG_OVERSIZED_GUEST) { + if (icount_enabled() || TCG_OVERSIZED_GUEST) { return false; } else { #ifdef TARGET_SUPPORTS_MTTCG @@ -146,7 +147,7 @@ static void tcg_set_thread(Object *obj, const char *val= ue, Error **errp) if (strcmp(value, "multi") =3D=3D 0) { if (TCG_OVERSIZED_GUEST) { error_setg(errp, "No MTTCG when guest word size > hosts"); - } else if (use_icount) { + } else if (icount_enabled()) { error_setg(errp, "No MTTCG when icount is enabled"); } else { #ifndef TARGET_SUPPORTS_MTTCG diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c index 42ce1dfcff..479edeb2ea 100644 --- a/accel/tcg/translate-all.c +++ b/accel/tcg/translate-all.c @@ -57,6 +57,7 @@ #include "qemu/main-loop.h" #include "exec/log.h" #include "sysemu/cpus.h" +#include "sysemu/cpu-timers.h" #include "sysemu/tcg.h" =20 /* #define DEBUG_TB_INVALIDATE */ @@ -369,7 +370,7 @@ static int cpu_restore_state_from_tb(CPUState *cpu, Tra= nslationBlock *tb, =20 found: if (reset_icount && (tb_cflags(tb) & CF_USE_ICOUNT)) { - assert(use_icount); + assert(icount_enabled()); /* Reset the cycle counter to the start of the block and shift if to the number of actually executed instructions */ cpu_neg(cpu)->icount_decr.u16.low +=3D num_insns - i; diff --git a/cpu-timers.c b/cpu-timers.c new file mode 100644 index 0000000000..20fea07625 --- /dev/null +++ b/cpu-timers.c @@ -0,0 +1,776 @@ +/* + * QEMU System Emulator + * + * Copyright (c) 2003-2008 Fabrice Bellard + * + * Permission is hereby granted, free of charge, to any person obtaining a= copy + * of this software and associated documentation files (the "Software"), t= o deal + * in the Software without restriction, including without limitation the r= ights + * to use, copy, modify, merge, publish, distribute, sublicense, and/or se= ll + * copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included= in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS= OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OT= HER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING= FROM, + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS = IN + * THE SOFTWARE. + */ + +#include "qemu/osdep.h" +#include "qemu-common.h" +#include "qemu/cutils.h" +#include "migration/vmstate.h" +#include "qapi/error.h" +#include "qemu/error-report.h" +#include "exec/exec-all.h" +#include "sysemu/cpus.h" +#include "sysemu/qtest.h" +#include "qemu/main-loop.h" +#include "qemu/option.h" +#include "qemu/seqlock.h" +#include "sysemu/replay.h" +#include "sysemu/runstate.h" +#include "hw/core/cpu.h" +#include "sysemu/cpu-timers.h" +#include "sysemu/cpu-throttle.h" + +typedef struct TimersState { + /* Protected by BQL. */ + int64_t cpu_ticks_prev; + int64_t cpu_ticks_offset; + + /* + * Protect fields that can be respectively read outside the + * BQL, and written from multiple threads. + */ + QemuSeqLock vm_clock_seqlock; + QemuSpin vm_clock_lock; + + int16_t cpu_ticks_enabled; + + /* Conversion factor from emulated instructions to virtual clock ticks= . */ + int16_t icount_time_shift; + + /* Compensate for varying guest execution speed. */ + int64_t qemu_icount_bias; + + int64_t vm_clock_warp_start; + int64_t cpu_clock_offset; + + /* Only written by TCG thread */ + int64_t qemu_icount; + + /* for adjusting icount */ + QEMUTimer *icount_rt_timer; + QEMUTimer *icount_vm_timer; + QEMUTimer *icount_warp_timer; +} TimersState; + +static TimersState timers_state; + +/* + * ICOUNT: Instruction Counter + */ +static bool icount_sleep =3D true; +/* Arbitrarily pick 1MIPS as the minimum allowable speed. */ +#define MAX_ICOUNT_SHIFT 10 + +/* + * 0 =3D Do not count executed instructions. + * 1 =3D Fixed conversion of insn to ns via "shift" option + * 2 =3D Runtime adaptive algorithm to compute shift + */ +static int use_icount; + +int icount_enabled(void) +{ + return use_icount; +} + +static void icount_enable_precise(void) +{ + use_icount =3D 1; +} + +static void icount_enable_adaptive(void) +{ + use_icount =3D 2; +} + +/* + * The current number of executed instructions is based on what we + * originally budgeted minus the current state of the decrementing + * icount counters in extra/u16.low. + */ +static int64_t icount_get_executed(CPUState *cpu) +{ + return (cpu->icount_budget - + (cpu_neg(cpu)->icount_decr.u16.low + cpu->icount_extra)); +} + +/* + * Update the global shared timer_state.qemu_icount to take into + * account executed instructions. This is done by the TCG vCPU + * thread so the main-loop can see time has moved forward. + */ +static void icount_update_locked(CPUState *cpu) +{ + int64_t executed =3D icount_get_executed(cpu); + cpu->icount_budget -=3D executed; + + atomic_set_i64(&timers_state.qemu_icount, + timers_state.qemu_icount + executed); +} + +/* + * Update the global shared timer_state.qemu_icount to take into + * account executed instructions. This is done by the TCG vCPU + * thread so the main-loop can see time has moved forward. + */ +void icount_update(CPUState *cpu) +{ + seqlock_write_lock(&timers_state.vm_clock_seqlock, + &timers_state.vm_clock_lock); + icount_update_locked(cpu); + seqlock_write_unlock(&timers_state.vm_clock_seqlock, + &timers_state.vm_clock_lock); +} + +static int64_t icount_get_raw_locked(void) +{ + CPUState *cpu =3D current_cpu; + + if (cpu && cpu->running) { + if (!cpu->can_do_io) { + error_report("Bad icount read"); + exit(1); + } + /* Take into account what has run */ + icount_update_locked(cpu); + } + /* The read is protected by the seqlock, but needs atomic64 to avoid U= B */ + return atomic_read_i64(&timers_state.qemu_icount); +} + +static int64_t icount_get_locked(void) +{ + int64_t icount =3D icount_get_raw_locked(); + return atomic_read_i64(&timers_state.qemu_icount_bias) + + icount_to_ns(icount); +} + +int64_t icount_get_raw(void) +{ + int64_t icount; + unsigned start; + + do { + start =3D seqlock_read_begin(&timers_state.vm_clock_seqlock); + icount =3D icount_get_raw_locked(); + } while (seqlock_read_retry(&timers_state.vm_clock_seqlock, start)); + + return icount; +} + +/* Return the virtual CPU time, based on the instruction counter. */ +int64_t icount_get(void) +{ + int64_t icount; + unsigned start; + + do { + start =3D seqlock_read_begin(&timers_state.vm_clock_seqlock); + icount =3D icount_get_locked(); + } while (seqlock_read_retry(&timers_state.vm_clock_seqlock, start)); + + return icount; +} + +int64_t icount_to_ns(int64_t icount) +{ + return icount << atomic_read(&timers_state.icount_time_shift); +} + +/* + * Correlation between real and virtual time is always going to be + * fairly approximate, so ignore small variation. + * When the guest is idle real and virtual time will be aligned in + * the IO wait loop. + */ +#define ICOUNT_WOBBLE (NANOSECONDS_PER_SECOND / 10) + +static int64_t cpu_get_clock_locked(void); + +static void icount_adjust(void) +{ + int64_t cur_time; + int64_t cur_icount; + int64_t delta; + + /* Protected by TimersState mutex. */ + static int64_t last_delta; + + /* If the VM is not running, then do nothing. */ + if (!runstate_is_running()) { + return; + } + + seqlock_write_lock(&timers_state.vm_clock_seqlock, + &timers_state.vm_clock_lock); + cur_time =3D cpu_get_clock_locked(); + cur_icount =3D icount_get_locked(); + + delta =3D cur_icount - cur_time; + /* FIXME: This is a very crude algorithm, somewhat prone to oscillatio= n. */ + if (delta > 0 + && last_delta + ICOUNT_WOBBLE < delta * 2 + && timers_state.icount_time_shift > 0) { + /* The guest is getting too far ahead. Slow time down. */ + atomic_set(&timers_state.icount_time_shift, + timers_state.icount_time_shift - 1); + } + if (delta < 0 + && last_delta - ICOUNT_WOBBLE > delta * 2 + && timers_state.icount_time_shift < MAX_ICOUNT_SHIFT) { + /* The guest is getting too far behind. Speed time up. */ + atomic_set(&timers_state.icount_time_shift, + timers_state.icount_time_shift + 1); + } + last_delta =3D delta; + atomic_set_i64(&timers_state.qemu_icount_bias, + cur_icount - (timers_state.qemu_icount + << timers_state.icount_time_shift)); + seqlock_write_unlock(&timers_state.vm_clock_seqlock, + &timers_state.vm_clock_lock); +} + +static void icount_adjust_rt(void *opaque) +{ + timer_mod(timers_state.icount_rt_timer, + qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_RT) + 1000); + icount_adjust(); +} + +static void icount_adjust_vm(void *opaque) +{ + timer_mod(timers_state.icount_vm_timer, + qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + + NANOSECONDS_PER_SECOND / 10); + icount_adjust(); +} + +int64_t icount_round(int64_t count) +{ + int shift =3D atomic_read(&timers_state.icount_time_shift); + return (count + (1 << shift) - 1) >> shift; +} + +static void icount_warp_rt(void) +{ + unsigned seq; + int64_t warp_start; + + /* + * The icount_warp_timer is rescheduled soon after vm_clock_warp_start + * changes from -1 to another value, so the race here is okay. + */ + do { + seq =3D seqlock_read_begin(&timers_state.vm_clock_seqlock); + warp_start =3D timers_state.vm_clock_warp_start; + } while (seqlock_read_retry(&timers_state.vm_clock_seqlock, seq)); + + if (warp_start =3D=3D -1) { + return; + } + + seqlock_write_lock(&timers_state.vm_clock_seqlock, + &timers_state.vm_clock_lock); + if (runstate_is_running()) { + int64_t clock =3D REPLAY_CLOCK_LOCKED(REPLAY_CLOCK_VIRTUAL_RT, + cpu_get_clock_locked()); + int64_t warp_delta; + + warp_delta =3D clock - timers_state.vm_clock_warp_start; + if (icount_enabled() =3D=3D 2) { + /* + * In adaptive mode, do not let QEMU_CLOCK_VIRTUAL run too + * far ahead of real time. + */ + int64_t cur_icount =3D icount_get_locked(); + int64_t delta =3D clock - cur_icount; + warp_delta =3D MIN(warp_delta, delta); + } + atomic_set_i64(&timers_state.qemu_icount_bias, + timers_state.qemu_icount_bias + warp_delta); + } + timers_state.vm_clock_warp_start =3D -1; + seqlock_write_unlock(&timers_state.vm_clock_seqlock, + &timers_state.vm_clock_lock); + + if (qemu_clock_expired(QEMU_CLOCK_VIRTUAL)) { + qemu_clock_notify(QEMU_CLOCK_VIRTUAL); + } +} + +static void icount_timer_cb(void *opaque) +{ + /* + * No need for a checkpoint because the timer already synchronizes + * with CHECKPOINT_CLOCK_VIRTUAL_RT. + */ + icount_warp_rt(); +} + +void icount_start_warp_timer(void) +{ + int64_t clock; + int64_t deadline; + + if (!icount_enabled()) { + return; + } + + /* + * Nothing to do if the VM is stopped: QEMU_CLOCK_VIRTUAL timers + * do not fire, so computing the deadline does not make sense. + */ + if (!runstate_is_running()) { + return; + } + + if (replay_mode !=3D REPLAY_MODE_PLAY) { + if (!all_cpu_threads_idle()) { + return; + } + + if (qtest_enabled()) { + /* When testing, qtest commands advance icount. */ + return; + } + + replay_checkpoint(CHECKPOINT_CLOCK_WARP_START); + } else { + /* warp clock deterministically in record/replay mode */ + if (!replay_checkpoint(CHECKPOINT_CLOCK_WARP_START)) { + /* + * vCPU is sleeping and warp can't be started. + * It is probably a race condition: notification sent + * to vCPU was processed in advance and vCPU went to sleep. + * Therefore we have to wake it up for doing someting. + */ + if (replay_has_checkpoint()) { + qemu_clock_notify(QEMU_CLOCK_VIRTUAL); + } + return; + } + } + + /* We want to use the earliest deadline from ALL vm_clocks */ + clock =3D qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL_RT); + deadline =3D qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL, + ~QEMU_TIMER_ATTR_EXTERNAL); + if (deadline < 0) { + static bool notified; + if (!icount_sleep && !notified) { + warn_report("icount sleep disabled and no active timers"); + notified =3D true; + } + return; + } + + if (deadline > 0) { + /* + * Ensure QEMU_CLOCK_VIRTUAL proceeds even when the virtual CPU go= es to + * sleep. Otherwise, the CPU might be waiting for a future timer + * interrupt to wake it up, but the interrupt never comes because + * the vCPU isn't running any insns and thus doesn't advance the + * QEMU_CLOCK_VIRTUAL. + */ + if (!icount_sleep) { + /* + * We never let VCPUs sleep in no sleep icount mode. + * If there is a pending QEMU_CLOCK_VIRTUAL timer we just adva= nce + * to the next QEMU_CLOCK_VIRTUAL event and notify it. + * It is useful when we want a deterministic execution time, + * isolated from host latencies. + */ + seqlock_write_lock(&timers_state.vm_clock_seqlock, + &timers_state.vm_clock_lock); + atomic_set_i64(&timers_state.qemu_icount_bias, + timers_state.qemu_icount_bias + deadline); + seqlock_write_unlock(&timers_state.vm_clock_seqlock, + &timers_state.vm_clock_lock); + qemu_clock_notify(QEMU_CLOCK_VIRTUAL); + } else { + /* + * We do stop VCPUs and only advance QEMU_CLOCK_VIRTUAL after = some + * "real" time, (related to the time left until the next event= ) has + * passed. The QEMU_CLOCK_VIRTUAL_RT clock will do this. + * This avoids that the warps are visible externally; for exam= ple, + * you will not be sending network packets continuously instea= d of + * every 100ms. + */ + seqlock_write_lock(&timers_state.vm_clock_seqlock, + &timers_state.vm_clock_lock); + if (timers_state.vm_clock_warp_start =3D=3D -1 + || timers_state.vm_clock_warp_start > clock) { + timers_state.vm_clock_warp_start =3D clock; + } + seqlock_write_unlock(&timers_state.vm_clock_seqlock, + &timers_state.vm_clock_lock); + timer_mod_anticipate(timers_state.icount_warp_timer, + clock + deadline); + } + } else if (deadline =3D=3D 0) { + qemu_clock_notify(QEMU_CLOCK_VIRTUAL); + } +} + +void icount_account_warp_timer(void) +{ + if (!use_icount || !icount_sleep) { + return; + } + + /* + * Nothing to do if the VM is stopped: QEMU_CLOCK_VIRTUAL timers + * do not fire, so computing the deadline does not make sense. + */ + if (!runstate_is_running()) { + return; + } + + /* warp clock deterministically in record/replay mode */ + if (!replay_checkpoint(CHECKPOINT_CLOCK_WARP_ACCOUNT)) { + return; + } + + timer_del(timers_state.icount_warp_timer); + icount_warp_rt(); +} + +void icount_configure(QemuOpts *opts, Error **errp) +{ + const char *option =3D qemu_opt_get(opts, "shift"); + bool sleep =3D qemu_opt_get_bool(opts, "sleep", true); + bool align =3D qemu_opt_get_bool(opts, "align", false); + long time_shift =3D -1; + + if (!option && qemu_opt_get(opts, "align")) { + error_setg(errp, "Please specify shift option when using align"); + return; + } + + if (align && !sleep) { + error_setg(errp, "align=3Don and sleep=3Doff are incompatible"); + return; + } + + if (strcmp(option, "auto") !=3D 0) { + if (qemu_strtol(option, NULL, 0, &time_shift) < 0 + || time_shift < 0 || time_shift > MAX_ICOUNT_SHIFT) { + error_setg(errp, "icount: Invalid shift value"); + return; + } + } else if (icount_align_option) { + error_setg(errp, "shift=3Dauto and align=3Don are incompatible"); + return; + } else if (!icount_sleep) { + error_setg(errp, "shift=3Dauto and sleep=3Doff are incompatible"); + return; + } + + icount_sleep =3D sleep; + if (icount_sleep) { + timers_state.icount_warp_timer =3D timer_new_ns(QEMU_CLOCK_VIRTUAL= _RT, + icount_timer_cb, NULL); + } + + icount_align_option =3D align; + + if (time_shift >=3D 0) { + timers_state.icount_time_shift =3D time_shift; + icount_enable_precise(); + return; + } + + icount_enable_adaptive(); + + /* + * 125MIPS seems a reasonable initial guess at the guest speed. + * It will be corrected fairly quickly anyway. + */ + timers_state.icount_time_shift =3D 3; + + /* + * Have both realtime and virtual time triggers for speed adjustment. + * The realtime trigger catches emulated time passing too slowly, + * the virtual time trigger catches emulated time passing too fast. + * Realtime triggers occur even when idle, so use them less frequently + * than VM triggers. + */ + timers_state.vm_clock_warp_start =3D -1; + timers_state.icount_rt_timer =3D timer_new_ms(QEMU_CLOCK_VIRTUAL_RT, + icount_adjust_rt, NULL); + timer_mod(timers_state.icount_rt_timer, + qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_RT) + 1000); + timers_state.icount_vm_timer =3D timer_new_ns(QEMU_CLOCK_VIRTUAL, + icount_adjust_vm, NULL); + timer_mod(timers_state.icount_vm_timer, + qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + + NANOSECONDS_PER_SECOND / 10); +} + +/* clock and ticks */ + +static int64_t cpu_get_ticks_locked(void) +{ + int64_t ticks =3D timers_state.cpu_ticks_offset; + if (timers_state.cpu_ticks_enabled) { + ticks +=3D cpu_get_host_ticks(); + } + + if (timers_state.cpu_ticks_prev > ticks) { + /* Non increasing ticks may happen if the host uses software suspe= nd. */ + timers_state.cpu_ticks_offset +=3D timers_state.cpu_ticks_prev - t= icks; + ticks =3D timers_state.cpu_ticks_prev; + } + + timers_state.cpu_ticks_prev =3D ticks; + return ticks; +} + +/* + * return the time elapsed in VM between vm_start and vm_stop. Unless + * icount is active, cpu_get_ticks() uses units of the host CPU cycle + * counter. + */ +int64_t cpu_get_ticks(void) +{ + int64_t ticks; + + if (icount_enabled()) { + return icount_get(); + } + + qemu_spin_lock(&timers_state.vm_clock_lock); + ticks =3D cpu_get_ticks_locked(); + qemu_spin_unlock(&timers_state.vm_clock_lock); + return ticks; +} + +static int64_t cpu_get_clock_locked(void) +{ + int64_t time; + + time =3D timers_state.cpu_clock_offset; + if (timers_state.cpu_ticks_enabled) { + time +=3D get_clock(); + } + + return time; +} + +/* + * Return the monotonic time elapsed in VM, i.e., + * the time between vm_start and vm_stop + */ +int64_t cpu_get_clock(void) +{ + int64_t ti; + unsigned start; + + do { + start =3D seqlock_read_begin(&timers_state.vm_clock_seqlock); + ti =3D cpu_get_clock_locked(); + } while (seqlock_read_retry(&timers_state.vm_clock_seqlock, start)); + + return ti; +} + +/* + * enable cpu_get_ticks() + * Caller must hold BQL which serves as mutex for vm_clock_seqlock. + */ +void cpu_enable_ticks(void) +{ + seqlock_write_lock(&timers_state.vm_clock_seqlock, + &timers_state.vm_clock_lock); + if (!timers_state.cpu_ticks_enabled) { + timers_state.cpu_ticks_offset -=3D cpu_get_host_ticks(); + timers_state.cpu_clock_offset -=3D get_clock(); + timers_state.cpu_ticks_enabled =3D 1; + } + seqlock_write_unlock(&timers_state.vm_clock_seqlock, + &timers_state.vm_clock_lock); +} + +/* + * disable cpu_get_ticks() : the clock is stopped. You must not call + * cpu_get_ticks() after that. + * Caller must hold BQL which serves as mutex for vm_clock_seqlock. + */ +void cpu_disable_ticks(void) +{ + seqlock_write_lock(&timers_state.vm_clock_seqlock, + &timers_state.vm_clock_lock); + if (timers_state.cpu_ticks_enabled) { + timers_state.cpu_ticks_offset +=3D cpu_get_host_ticks(); + timers_state.cpu_clock_offset =3D cpu_get_clock_locked(); + timers_state.cpu_ticks_enabled =3D 0; + } + seqlock_write_unlock(&timers_state.vm_clock_seqlock, + &timers_state.vm_clock_lock); +} + +void qtest_clock_warp(int64_t dest) +{ + int64_t clock =3D qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL); + AioContext *aio_context; + assert(qtest_enabled()); + aio_context =3D qemu_get_aio_context(); + while (clock < dest) { + int64_t deadline =3D qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL, + QEMU_TIMER_ATTR_ALL); + int64_t warp =3D qemu_soonest_timeout(dest - clock, deadline); + + seqlock_write_lock(&timers_state.vm_clock_seqlock, + &timers_state.vm_clock_lock); + atomic_set_i64(&timers_state.qemu_icount_bias, + timers_state.qemu_icount_bias + warp); + seqlock_write_unlock(&timers_state.vm_clock_seqlock, + &timers_state.vm_clock_lock); + + qemu_clock_run_timers(QEMU_CLOCK_VIRTUAL); + timerlist_run_timers(aio_context->tlg.tl[QEMU_CLOCK_VIRTUAL]); + clock =3D qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL); + } + qemu_clock_notify(QEMU_CLOCK_VIRTUAL); +} + +static bool icount_state_needed(void *opaque) +{ + return icount_enabled(); +} + +static bool warp_timer_state_needed(void *opaque) +{ + TimersState *s =3D opaque; + return s->icount_warp_timer !=3D NULL; +} + +static bool adjust_timers_state_needed(void *opaque) +{ + TimersState *s =3D opaque; + return s->icount_rt_timer !=3D NULL; +} + +/* + * Subsection for warp timer migration is optional, because may not be cre= ated + */ +static const VMStateDescription icount_vmstate_warp_timer =3D { + .name =3D "timer/icount/warp_timer", + .version_id =3D 1, + .minimum_version_id =3D 1, + .needed =3D warp_timer_state_needed, + .fields =3D (VMStateField[]) { + VMSTATE_INT64(vm_clock_warp_start, TimersState), + VMSTATE_TIMER_PTR(icount_warp_timer, TimersState), + VMSTATE_END_OF_LIST() + } +}; + +static const VMStateDescription icount_vmstate_adjust_timers =3D { + .name =3D "timer/icount/timers", + .version_id =3D 1, + .minimum_version_id =3D 1, + .needed =3D adjust_timers_state_needed, + .fields =3D (VMStateField[]) { + VMSTATE_TIMER_PTR(icount_rt_timer, TimersState), + VMSTATE_TIMER_PTR(icount_vm_timer, TimersState), + VMSTATE_END_OF_LIST() + } +}; + +/* + * This is a subsection for icount migration. + */ +static const VMStateDescription icount_vmstate_timers =3D { + .name =3D "timer/icount", + .version_id =3D 1, + .minimum_version_id =3D 1, + .needed =3D icount_state_needed, + .fields =3D (VMStateField[]) { + VMSTATE_INT64(qemu_icount_bias, TimersState), + VMSTATE_INT64(qemu_icount, TimersState), + VMSTATE_END_OF_LIST() + }, + .subsections =3D (const VMStateDescription * []) { + &icount_vmstate_warp_timer, + &icount_vmstate_adjust_timers, + NULL + } +}; + +static const VMStateDescription vmstate_timers =3D { + .name =3D "timer", + .version_id =3D 2, + .minimum_version_id =3D 1, + .fields =3D (VMStateField[]) { + VMSTATE_INT64(cpu_ticks_offset, TimersState), + VMSTATE_UNUSED(8), + VMSTATE_INT64_V(cpu_clock_offset, TimersState, 2), + VMSTATE_END_OF_LIST() + }, + .subsections =3D (const VMStateDescription * []) { + &icount_vmstate_timers, + NULL + } +}; + +static void do_nothing(CPUState *cpu, run_on_cpu_data unused) +{ +} + +void qemu_timer_notify_cb(void *opaque, QEMUClockType type) +{ + if (!icount_enabled() || type !=3D QEMU_CLOCK_VIRTUAL) { + qemu_notify_event(); + return; + } + + if (qemu_in_vcpu_thread()) { + /* + * A CPU is currently running; kick it back out to the + * tcg_cpu_exec() loop so it will recalculate its + * icount deadline immediately. + */ + qemu_cpu_kick(current_cpu); + } else if (first_cpu) { + /* + * qemu_cpu_kick is not enough to kick a halted CPU out of + * qemu_tcg_wait_io_event. async_run_on_cpu, instead, + * causes cpu_thread_is_idle to return false. This way, + * handle_icount_deadline can run. + * If we have no CPUs at all for some reason, we don't + * need to do anything. + */ + async_run_on_cpu(first_cpu, do_nothing, RUN_ON_CPU_NULL); + } +} + +/* initialize this module and the cpu throttle for convenience as well */ +void cpu_timers_init(void) +{ + seqlock_init(&timers_state.vm_clock_seqlock); + qemu_spin_init(&timers_state.vm_clock_lock); + vmstate_register(NULL, 0, &vmstate_timers, &timers_state); + + cpu_throttle_init(); +} diff --git a/cpus.c b/cpus.c index 3a46a4fc2b..7e9f545be8 100644 --- a/cpus.c +++ b/cpus.c @@ -58,11 +58,10 @@ #include "hw/nmi.h" #include "sysemu/replay.h" #include "sysemu/runstate.h" +#include "sysemu/cpu-timers.h" #include "hw/boards.h" #include "hw/hw.h" =20 -#include "sysemu/cpu-throttle.h" - #ifdef CONFIG_LINUX =20 #include @@ -83,9 +82,6 @@ =20 static QemuMutex qemu_global_mutex; =20 -int64_t max_delay; -int64_t max_advance; - bool cpu_is_stopped(CPUState *cpu) { return cpu->stopped || !runstate_is_running(); @@ -106,7 +102,7 @@ static bool cpu_thread_is_idle(CPUState *cpu) return true; } =20 -static bool all_cpu_threads_idle(void) +bool all_cpu_threads_idle(void) { CPUState *cpu; =20 @@ -118,668 +114,8 @@ static bool all_cpu_threads_idle(void) return true; } =20 -/***********************************************************/ -/* guest cycle counter */ - -/* Protected by TimersState seqlock */ - -static bool icount_sleep =3D true; -/* Arbitrarily pick 1MIPS as the minimum allowable speed. */ -#define MAX_ICOUNT_SHIFT 10 - -typedef struct TimersState { - /* Protected by BQL. */ - int64_t cpu_ticks_prev; - int64_t cpu_ticks_offset; - - /* Protect fields that can be respectively read outside the - * BQL, and written from multiple threads. - */ - QemuSeqLock vm_clock_seqlock; - QemuSpin vm_clock_lock; - - int16_t cpu_ticks_enabled; - - /* Conversion factor from emulated instructions to virtual clock ticks= . */ - int16_t icount_time_shift; - - /* Compensate for varying guest execution speed. */ - int64_t qemu_icount_bias; - - int64_t vm_clock_warp_start; - int64_t cpu_clock_offset; - - /* Only written by TCG thread */ - int64_t qemu_icount; - - /* for adjusting icount */ - QEMUTimer *icount_rt_timer; - QEMUTimer *icount_vm_timer; - QEMUTimer *icount_warp_timer; -} TimersState; - -static TimersState timers_state; bool mttcg_enabled; =20 - -/* The current number of executed instructions is based on what we - * originally budgeted minus the current state of the decrementing - * icount counters in extra/u16.low. - */ -static int64_t cpu_get_icount_executed(CPUState *cpu) -{ - return (cpu->icount_budget - - (cpu_neg(cpu)->icount_decr.u16.low + cpu->icount_extra)); -} - -/* - * Update the global shared timer_state.qemu_icount to take into - * account executed instructions. This is done by the TCG vCPU - * thread so the main-loop can see time has moved forward. - */ -static void cpu_update_icount_locked(CPUState *cpu) -{ - int64_t executed =3D cpu_get_icount_executed(cpu); - cpu->icount_budget -=3D executed; - - atomic_set_i64(&timers_state.qemu_icount, - timers_state.qemu_icount + executed); -} - -/* - * Update the global shared timer_state.qemu_icount to take into - * account executed instructions. This is done by the TCG vCPU - * thread so the main-loop can see time has moved forward. - */ -void cpu_update_icount(CPUState *cpu) -{ - seqlock_write_lock(&timers_state.vm_clock_seqlock, - &timers_state.vm_clock_lock); - cpu_update_icount_locked(cpu); - seqlock_write_unlock(&timers_state.vm_clock_seqlock, - &timers_state.vm_clock_lock); -} - -static int64_t cpu_get_icount_raw_locked(void) -{ - CPUState *cpu =3D current_cpu; - - if (cpu && cpu->running) { - if (!cpu->can_do_io) { - error_report("Bad icount read"); - exit(1); - } - /* Take into account what has run */ - cpu_update_icount_locked(cpu); - } - /* The read is protected by the seqlock, but needs atomic64 to avoid U= B */ - return atomic_read_i64(&timers_state.qemu_icount); -} - -static int64_t cpu_get_icount_locked(void) -{ - int64_t icount =3D cpu_get_icount_raw_locked(); - return atomic_read_i64(&timers_state.qemu_icount_bias) + - cpu_icount_to_ns(icount); -} - -int64_t cpu_get_icount_raw(void) -{ - int64_t icount; - unsigned start; - - do { - start =3D seqlock_read_begin(&timers_state.vm_clock_seqlock); - icount =3D cpu_get_icount_raw_locked(); - } while (seqlock_read_retry(&timers_state.vm_clock_seqlock, start)); - - return icount; -} - -/* Return the virtual CPU time, based on the instruction counter. */ -int64_t cpu_get_icount(void) -{ - int64_t icount; - unsigned start; - - do { - start =3D seqlock_read_begin(&timers_state.vm_clock_seqlock); - icount =3D cpu_get_icount_locked(); - } while (seqlock_read_retry(&timers_state.vm_clock_seqlock, start)); - - return icount; -} - -int64_t cpu_icount_to_ns(int64_t icount) -{ - return icount << atomic_read(&timers_state.icount_time_shift); -} - -static int64_t cpu_get_ticks_locked(void) -{ - int64_t ticks =3D timers_state.cpu_ticks_offset; - if (timers_state.cpu_ticks_enabled) { - ticks +=3D cpu_get_host_ticks(); - } - - if (timers_state.cpu_ticks_prev > ticks) { - /* Non increasing ticks may happen if the host uses software suspe= nd. */ - timers_state.cpu_ticks_offset +=3D timers_state.cpu_ticks_prev - t= icks; - ticks =3D timers_state.cpu_ticks_prev; - } - - timers_state.cpu_ticks_prev =3D ticks; - return ticks; -} - -/* return the time elapsed in VM between vm_start and vm_stop. Unless - * icount is active, cpu_get_ticks() uses units of the host CPU cycle - * counter. - */ -int64_t cpu_get_ticks(void) -{ - int64_t ticks; - - if (use_icount) { - return cpu_get_icount(); - } - - qemu_spin_lock(&timers_state.vm_clock_lock); - ticks =3D cpu_get_ticks_locked(); - qemu_spin_unlock(&timers_state.vm_clock_lock); - return ticks; -} - -static int64_t cpu_get_clock_locked(void) -{ - int64_t time; - - time =3D timers_state.cpu_clock_offset; - if (timers_state.cpu_ticks_enabled) { - time +=3D get_clock(); - } - - return time; -} - -/* Return the monotonic time elapsed in VM, i.e., - * the time between vm_start and vm_stop - */ -int64_t cpu_get_clock(void) -{ - int64_t ti; - unsigned start; - - do { - start =3D seqlock_read_begin(&timers_state.vm_clock_seqlock); - ti =3D cpu_get_clock_locked(); - } while (seqlock_read_retry(&timers_state.vm_clock_seqlock, start)); - - return ti; -} - -/* enable cpu_get_ticks() - * Caller must hold BQL which serves as mutex for vm_clock_seqlock. - */ -void cpu_enable_ticks(void) -{ - seqlock_write_lock(&timers_state.vm_clock_seqlock, - &timers_state.vm_clock_lock); - if (!timers_state.cpu_ticks_enabled) { - timers_state.cpu_ticks_offset -=3D cpu_get_host_ticks(); - timers_state.cpu_clock_offset -=3D get_clock(); - timers_state.cpu_ticks_enabled =3D 1; - } - seqlock_write_unlock(&timers_state.vm_clock_seqlock, - &timers_state.vm_clock_lock); -} - -/* disable cpu_get_ticks() : the clock is stopped. You must not call - * cpu_get_ticks() after that. - * Caller must hold BQL which serves as mutex for vm_clock_seqlock. - */ -void cpu_disable_ticks(void) -{ - seqlock_write_lock(&timers_state.vm_clock_seqlock, - &timers_state.vm_clock_lock); - if (timers_state.cpu_ticks_enabled) { - timers_state.cpu_ticks_offset +=3D cpu_get_host_ticks(); - timers_state.cpu_clock_offset =3D cpu_get_clock_locked(); - timers_state.cpu_ticks_enabled =3D 0; - } - seqlock_write_unlock(&timers_state.vm_clock_seqlock, - &timers_state.vm_clock_lock); -} - -/* Correlation between real and virtual time is always going to be - fairly approximate, so ignore small variation. - When the guest is idle real and virtual time will be aligned in - the IO wait loop. */ -#define ICOUNT_WOBBLE (NANOSECONDS_PER_SECOND / 10) - -static void icount_adjust(void) -{ - int64_t cur_time; - int64_t cur_icount; - int64_t delta; - - /* Protected by TimersState mutex. */ - static int64_t last_delta; - - /* If the VM is not running, then do nothing. */ - if (!runstate_is_running()) { - return; - } - - seqlock_write_lock(&timers_state.vm_clock_seqlock, - &timers_state.vm_clock_lock); - cur_time =3D cpu_get_clock_locked(); - cur_icount =3D cpu_get_icount_locked(); - - delta =3D cur_icount - cur_time; - /* FIXME: This is a very crude algorithm, somewhat prone to oscillatio= n. */ - if (delta > 0 - && last_delta + ICOUNT_WOBBLE < delta * 2 - && timers_state.icount_time_shift > 0) { - /* The guest is getting too far ahead. Slow time down. */ - atomic_set(&timers_state.icount_time_shift, - timers_state.icount_time_shift - 1); - } - if (delta < 0 - && last_delta - ICOUNT_WOBBLE > delta * 2 - && timers_state.icount_time_shift < MAX_ICOUNT_SHIFT) { - /* The guest is getting too far behind. Speed time up. */ - atomic_set(&timers_state.icount_time_shift, - timers_state.icount_time_shift + 1); - } - last_delta =3D delta; - atomic_set_i64(&timers_state.qemu_icount_bias, - cur_icount - (timers_state.qemu_icount - << timers_state.icount_time_shift)); - seqlock_write_unlock(&timers_state.vm_clock_seqlock, - &timers_state.vm_clock_lock); -} - -static void icount_adjust_rt(void *opaque) -{ - timer_mod(timers_state.icount_rt_timer, - qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_RT) + 1000); - icount_adjust(); -} - -static void icount_adjust_vm(void *opaque) -{ - timer_mod(timers_state.icount_vm_timer, - qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + - NANOSECONDS_PER_SECOND / 10); - icount_adjust(); -} - -static int64_t qemu_icount_round(int64_t count) -{ - int shift =3D atomic_read(&timers_state.icount_time_shift); - return (count + (1 << shift) - 1) >> shift; -} - -static void icount_warp_rt(void) -{ - unsigned seq; - int64_t warp_start; - - /* The icount_warp_timer is rescheduled soon after vm_clock_warp_start - * changes from -1 to another value, so the race here is okay. - */ - do { - seq =3D seqlock_read_begin(&timers_state.vm_clock_seqlock); - warp_start =3D timers_state.vm_clock_warp_start; - } while (seqlock_read_retry(&timers_state.vm_clock_seqlock, seq)); - - if (warp_start =3D=3D -1) { - return; - } - - seqlock_write_lock(&timers_state.vm_clock_seqlock, - &timers_state.vm_clock_lock); - if (runstate_is_running()) { - int64_t clock =3D REPLAY_CLOCK_LOCKED(REPLAY_CLOCK_VIRTUAL_RT, - cpu_get_clock_locked()); - int64_t warp_delta; - - warp_delta =3D clock - timers_state.vm_clock_warp_start; - if (use_icount =3D=3D 2) { - /* - * In adaptive mode, do not let QEMU_CLOCK_VIRTUAL run too - * far ahead of real time. - */ - int64_t cur_icount =3D cpu_get_icount_locked(); - int64_t delta =3D clock - cur_icount; - warp_delta =3D MIN(warp_delta, delta); - } - atomic_set_i64(&timers_state.qemu_icount_bias, - timers_state.qemu_icount_bias + warp_delta); - } - timers_state.vm_clock_warp_start =3D -1; - seqlock_write_unlock(&timers_state.vm_clock_seqlock, - &timers_state.vm_clock_lock); - - if (qemu_clock_expired(QEMU_CLOCK_VIRTUAL)) { - qemu_clock_notify(QEMU_CLOCK_VIRTUAL); - } -} - -static void icount_timer_cb(void *opaque) -{ - /* No need for a checkpoint because the timer already synchronizes - * with CHECKPOINT_CLOCK_VIRTUAL_RT. - */ - icount_warp_rt(); -} - -void qtest_clock_warp(int64_t dest) -{ - int64_t clock =3D qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL); - AioContext *aio_context; - assert(qtest_enabled()); - aio_context =3D qemu_get_aio_context(); - while (clock < dest) { - int64_t deadline =3D qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL, - QEMU_TIMER_ATTR_ALL); - int64_t warp =3D qemu_soonest_timeout(dest - clock, deadline); - - seqlock_write_lock(&timers_state.vm_clock_seqlock, - &timers_state.vm_clock_lock); - atomic_set_i64(&timers_state.qemu_icount_bias, - timers_state.qemu_icount_bias + warp); - seqlock_write_unlock(&timers_state.vm_clock_seqlock, - &timers_state.vm_clock_lock); - - qemu_clock_run_timers(QEMU_CLOCK_VIRTUAL); - timerlist_run_timers(aio_context->tlg.tl[QEMU_CLOCK_VIRTUAL]); - clock =3D qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL); - } - qemu_clock_notify(QEMU_CLOCK_VIRTUAL); -} - -void qemu_start_warp_timer(void) -{ - int64_t clock; - int64_t deadline; - - if (!use_icount) { - return; - } - - /* Nothing to do if the VM is stopped: QEMU_CLOCK_VIRTUAL timers - * do not fire, so computing the deadline does not make sense. - */ - if (!runstate_is_running()) { - return; - } - - if (replay_mode !=3D REPLAY_MODE_PLAY) { - if (!all_cpu_threads_idle()) { - return; - } - - if (qtest_enabled()) { - /* When testing, qtest commands advance icount. */ - return; - } - - replay_checkpoint(CHECKPOINT_CLOCK_WARP_START); - } else { - /* warp clock deterministically in record/replay mode */ - if (!replay_checkpoint(CHECKPOINT_CLOCK_WARP_START)) { - /* vCPU is sleeping and warp can't be started. - It is probably a race condition: notification sent - to vCPU was processed in advance and vCPU went to sleep. - Therefore we have to wake it up for doing someting. */ - if (replay_has_checkpoint()) { - qemu_clock_notify(QEMU_CLOCK_VIRTUAL); - } - return; - } - } - - /* We want to use the earliest deadline from ALL vm_clocks */ - clock =3D qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL_RT); - deadline =3D qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL, - ~QEMU_TIMER_ATTR_EXTERNAL); - if (deadline < 0) { - static bool notified; - if (!icount_sleep && !notified) { - warn_report("icount sleep disabled and no active timers"); - notified =3D true; - } - return; - } - - if (deadline > 0) { - /* - * Ensure QEMU_CLOCK_VIRTUAL proceeds even when the virtual CPU go= es to - * sleep. Otherwise, the CPU might be waiting for a future timer - * interrupt to wake it up, but the interrupt never comes because - * the vCPU isn't running any insns and thus doesn't advance the - * QEMU_CLOCK_VIRTUAL. - */ - if (!icount_sleep) { - /* - * We never let VCPUs sleep in no sleep icount mode. - * If there is a pending QEMU_CLOCK_VIRTUAL timer we just adva= nce - * to the next QEMU_CLOCK_VIRTUAL event and notify it. - * It is useful when we want a deterministic execution time, - * isolated from host latencies. - */ - seqlock_write_lock(&timers_state.vm_clock_seqlock, - &timers_state.vm_clock_lock); - atomic_set_i64(&timers_state.qemu_icount_bias, - timers_state.qemu_icount_bias + deadline); - seqlock_write_unlock(&timers_state.vm_clock_seqlock, - &timers_state.vm_clock_lock); - qemu_clock_notify(QEMU_CLOCK_VIRTUAL); - } else { - /* - * We do stop VCPUs and only advance QEMU_CLOCK_VIRTUAL after = some - * "real" time, (related to the time left until the next event= ) has - * passed. The QEMU_CLOCK_VIRTUAL_RT clock will do this. - * This avoids that the warps are visible externally; for exam= ple, - * you will not be sending network packets continuously instea= d of - * every 100ms. - */ - seqlock_write_lock(&timers_state.vm_clock_seqlock, - &timers_state.vm_clock_lock); - if (timers_state.vm_clock_warp_start =3D=3D -1 - || timers_state.vm_clock_warp_start > clock) { - timers_state.vm_clock_warp_start =3D clock; - } - seqlock_write_unlock(&timers_state.vm_clock_seqlock, - &timers_state.vm_clock_lock); - timer_mod_anticipate(timers_state.icount_warp_timer, - clock + deadline); - } - } else if (deadline =3D=3D 0) { - qemu_clock_notify(QEMU_CLOCK_VIRTUAL); - } -} - -static void qemu_account_warp_timer(void) -{ - if (!use_icount || !icount_sleep) { - return; - } - - /* Nothing to do if the VM is stopped: QEMU_CLOCK_VIRTUAL timers - * do not fire, so computing the deadline does not make sense. - */ - if (!runstate_is_running()) { - return; - } - - /* warp clock deterministically in record/replay mode */ - if (!replay_checkpoint(CHECKPOINT_CLOCK_WARP_ACCOUNT)) { - return; - } - - timer_del(timers_state.icount_warp_timer); - icount_warp_rt(); -} - -static bool icount_state_needed(void *opaque) -{ - return use_icount; -} - -static bool warp_timer_state_needed(void *opaque) -{ - TimersState *s =3D opaque; - return s->icount_warp_timer !=3D NULL; -} - -static bool adjust_timers_state_needed(void *opaque) -{ - TimersState *s =3D opaque; - return s->icount_rt_timer !=3D NULL; -} - -/* - * Subsection for warp timer migration is optional, because may not be cre= ated - */ -static const VMStateDescription icount_vmstate_warp_timer =3D { - .name =3D "timer/icount/warp_timer", - .version_id =3D 1, - .minimum_version_id =3D 1, - .needed =3D warp_timer_state_needed, - .fields =3D (VMStateField[]) { - VMSTATE_INT64(vm_clock_warp_start, TimersState), - VMSTATE_TIMER_PTR(icount_warp_timer, TimersState), - VMSTATE_END_OF_LIST() - } -}; - -static const VMStateDescription icount_vmstate_adjust_timers =3D { - .name =3D "timer/icount/timers", - .version_id =3D 1, - .minimum_version_id =3D 1, - .needed =3D adjust_timers_state_needed, - .fields =3D (VMStateField[]) { - VMSTATE_TIMER_PTR(icount_rt_timer, TimersState), - VMSTATE_TIMER_PTR(icount_vm_timer, TimersState), - VMSTATE_END_OF_LIST() - } -}; - -/* - * This is a subsection for icount migration. - */ -static const VMStateDescription icount_vmstate_timers =3D { - .name =3D "timer/icount", - .version_id =3D 1, - .minimum_version_id =3D 1, - .needed =3D icount_state_needed, - .fields =3D (VMStateField[]) { - VMSTATE_INT64(qemu_icount_bias, TimersState), - VMSTATE_INT64(qemu_icount, TimersState), - VMSTATE_END_OF_LIST() - }, - .subsections =3D (const VMStateDescription*[]) { - &icount_vmstate_warp_timer, - &icount_vmstate_adjust_timers, - NULL - } -}; - -static const VMStateDescription vmstate_timers =3D { - .name =3D "timer", - .version_id =3D 2, - .minimum_version_id =3D 1, - .fields =3D (VMStateField[]) { - VMSTATE_INT64(cpu_ticks_offset, TimersState), - VMSTATE_UNUSED(8), - VMSTATE_INT64_V(cpu_clock_offset, TimersState, 2), - VMSTATE_END_OF_LIST() - }, - .subsections =3D (const VMStateDescription*[]) { - &icount_vmstate_timers, - NULL - } -}; - -void cpu_ticks_init(void) -{ - seqlock_init(&timers_state.vm_clock_seqlock); - qemu_spin_init(&timers_state.vm_clock_lock); - vmstate_register(NULL, 0, &vmstate_timers, &timers_state); - cpu_throttle_init(); -} - -void configure_icount(QemuOpts *opts, Error **errp) -{ - const char *option =3D qemu_opt_get(opts, "shift"); - bool sleep =3D qemu_opt_get_bool(opts, "sleep", true); - bool align =3D qemu_opt_get_bool(opts, "align", false); - long time_shift =3D -1; - - if (!option && qemu_opt_get(opts, "align")) { - error_setg(errp, "Please specify shift option when using align"); - return; - } - - if (align && !sleep) { - error_setg(errp, "align=3Don and sleep=3Doff are incompatible"); - return; - } - - if (strcmp(option, "auto") !=3D 0) { - if (qemu_strtol(option, NULL, 0, &time_shift) < 0 - || time_shift < 0 || time_shift > MAX_ICOUNT_SHIFT) { - error_setg(errp, "icount: Invalid shift value"); - return; - } - } else if (icount_align_option) { - error_setg(errp, "shift=3Dauto and align=3Don are incompatible"); - return; - } else if (!icount_sleep) { - error_setg(errp, "shift=3Dauto and sleep=3Doff are incompatible"); - return; - } - - icount_sleep =3D sleep; - if (icount_sleep) { - timers_state.icount_warp_timer =3D timer_new_ns(QEMU_CLOCK_VIRTUAL= _RT, - icount_timer_cb, NULL); - } - - icount_align_option =3D align; - - if (time_shift >=3D 0) { - timers_state.icount_time_shift =3D time_shift; - use_icount =3D 1; - return; - } - - use_icount =3D 2; - - /* 125MIPS seems a reasonable initial guess at the guest speed. - It will be corrected fairly quickly anyway. */ - timers_state.icount_time_shift =3D 3; - - /* Have both realtime and virtual time triggers for speed adjustment. - The realtime trigger catches emulated time passing too slowly, - the virtual time trigger catches emulated time passing too fast. - Realtime triggers occur even when idle, so use them less frequently - than VM triggers. */ - timers_state.vm_clock_warp_start =3D -1; - timers_state.icount_rt_timer =3D timer_new_ms(QEMU_CLOCK_VIRTUAL_RT, - icount_adjust_rt, NULL); - timer_mod(timers_state.icount_rt_timer, - qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_RT) + 1000); - timers_state.icount_vm_timer =3D timer_new_ns(QEMU_CLOCK_VIRTUAL, - icount_adjust_vm, NULL); - timer_mod(timers_state.icount_vm_timer, - qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + - NANOSECONDS_PER_SECOND / 10); -} - /***********************************************************/ /* TCG vCPU kick timer * @@ -824,35 +160,6 @@ static void qemu_cpu_kick_rr_cpus(void) }; } =20 -static void do_nothing(CPUState *cpu, run_on_cpu_data unused) -{ -} - -void qemu_timer_notify_cb(void *opaque, QEMUClockType type) -{ - if (!use_icount || type !=3D QEMU_CLOCK_VIRTUAL) { - qemu_notify_event(); - return; - } - - if (qemu_in_vcpu_thread()) { - /* A CPU is currently running; kick it back out to the - * tcg_cpu_exec() loop so it will recalculate its - * icount deadline immediately. - */ - qemu_cpu_kick(current_cpu); - } else if (first_cpu) { - /* qemu_cpu_kick is not enough to kick a halted CPU out of - * qemu_tcg_wait_io_event. async_run_on_cpu, instead, - * causes cpu_thread_is_idle to return false. This way, - * handle_icount_deadline can run. - * If we have no CPUs at all for some reason, we don't - * need to do anything. - */ - async_run_on_cpu(first_cpu, do_nothing, RUN_ON_CPU_NULL); - } -} - static void kick_tcg_thread(void *opaque) { timer_mod(tcg_kick_vcpu_timer, qemu_tcg_next_kick()); @@ -1254,7 +561,7 @@ static int64_t tcg_get_icount_limit(void) deadline =3D INT32_MAX; } =20 - return qemu_icount_round(deadline); + return icount_round(deadline); } else { return replay_get_instructions(); } @@ -1263,7 +570,7 @@ static int64_t tcg_get_icount_limit(void) static void handle_icount_deadline(void) { assert(qemu_in_vcpu_thread()); - if (use_icount) { + if (icount_enabled()) { int64_t deadline =3D qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL, QEMU_TIMER_ATTR_ALL); =20 @@ -1277,7 +584,7 @@ static void handle_icount_deadline(void) =20 static void prepare_icount_for_run(CPUState *cpu) { - if (use_icount) { + if (icount_enabled()) { int insns_left; =20 /* These should always be cleared by process_icount_data after @@ -1298,9 +605,9 @@ static void prepare_icount_for_run(CPUState *cpu) =20 static void process_icount_data(CPUState *cpu) { - if (use_icount) { + if (icount_enabled()) { /* Account for executed instructions */ - cpu_update_icount(cpu); + icount_update(cpu); =20 /* Reset the counters */ cpu_neg(cpu)->icount_decr.u16.low =3D 0; @@ -1401,7 +708,7 @@ static void *qemu_tcg_rr_cpu_thread_fn(void *arg) replay_mutex_lock(); qemu_mutex_lock_iothread(); /* Account partial waits to QEMU_CLOCK_VIRTUAL. */ - qemu_account_warp_timer(); + icount_account_warp_timer(); =20 /* Run the timers here. This is much more efficient than * waking up the I/O thread and waiting for completion. @@ -1459,7 +766,7 @@ static void *qemu_tcg_rr_cpu_thread_fn(void *arg) atomic_mb_set(&cpu->exit_request, 0); } =20 - if (use_icount && all_cpu_threads_idle()) { + if (icount_enabled() && all_cpu_threads_idle()) { /* * When all cpus are sleeping (e.g in WFI), to avoid a deadlock * in the main_loop, wake it up in order to start the warp tim= er. @@ -1612,7 +919,7 @@ static void *qemu_tcg_cpu_thread_fn(void *arg) CPUState *cpu =3D arg; =20 assert(tcg_enabled()); - g_assert(!use_icount); + g_assert(!icount_enabled()); =20 rcu_register_thread(); tcg_register_thread(); @@ -2191,21 +1498,3 @@ void qmp_inject_nmi(Error **errp) nmi_monitor_handle(monitor_get_cpu_index(), errp); } =20 -void dump_drift_info(void) -{ - if (!use_icount) { - return; - } - - qemu_printf("Host - Guest clock %"PRIi64" ms\n", - (cpu_get_clock() - cpu_get_icount())/SCALE_MS); - if (icount_align_option) { - qemu_printf("Max guest delay %"PRIi64" ms\n", - -max_delay / SCALE_MS); - qemu_printf("Max guest advance %"PRIi64" ms\n", - max_advance / SCALE_MS); - } else { - qemu_printf("Max guest delay NA\n"); - qemu_printf("Max guest advance NA\n"); - } -} diff --git a/docs/replay.txt b/docs/replay.txt index 70c27edb36..8952e6d852 100644 --- a/docs/replay.txt +++ b/docs/replay.txt @@ -184,11 +184,11 @@ is then incremented (which is called "warping" the vi= rtual clock) as soon as the timer fires or the CPUs need to go out of the idle state. Two functions are used for this purpose; because these actions change virtual machine state and must be deterministic, each of them creates a -checkpoint. qemu_start_warp_timer checks if the CPUs are idle and if so -starts accounting real time to virtual clock. qemu_account_warp_timer +checkpoint. icount_start_warp_timer checks if the CPUs are idle and if so +starts accounting real time to virtual clock. icount_account_warp_timer is called when the CPUs get an interrupt or when the warp timer fires, and it warps the virtual clock by the amount of real time that has passed -since qemu_start_warp_timer. +since icount_start_warp_timer. =20 Bottom halves ------------- diff --git a/exec.c b/exec.c index 5162f0d12f..db9a90469b 100644 --- a/exec.c +++ b/exec.c @@ -104,10 +104,6 @@ uintptr_t qemu_host_page_size; intptr_t qemu_host_page_mask; =20 #if !defined(CONFIG_USER_ONLY) -/* 0 =3D Do not count executed instructions. - 1 =3D Precise instruction counting. - 2 =3D Adaptive rate instruction counting. */ -int use_icount; =20 typedef struct PhysPageEntry PhysPageEntry; =20 diff --git a/hw/core/ptimer.c b/hw/core/ptimer.c index b5a54e2536..6c9f33208a 100644 --- a/hw/core/ptimer.c +++ b/hw/core/ptimer.c @@ -7,11 +7,11 @@ */ =20 #include "qemu/osdep.h" -#include "qemu/timer.h" #include "hw/ptimer.h" #include "migration/vmstate.h" #include "qemu/host-utils.h" #include "sysemu/replay.h" +#include "sysemu/cpu-timers.h" #include "sysemu/qtest.h" #include "block/aio.h" #include "sysemu/cpus.h" @@ -134,7 +134,7 @@ static void ptimer_reload(ptimer_state *s, int delta_ad= just) * on the current generation of host machines. */ =20 - if (s->enabled =3D=3D 1 && (delta * period < 10000) && !use_icount) { + if (s->enabled =3D=3D 1 && (delta * period < 10000) && !icount_enabled= ()) { period =3D 10000 / delta; period_frac =3D 0; } @@ -217,7 +217,7 @@ uint64_t ptimer_get_count(ptimer_state *s) uint32_t period_frac =3D s->period_frac; uint64_t period =3D s->period; =20 - if (!oneshot && (s->delta * period < 10000) && !use_icount) { + if (!oneshot && (s->delta * period < 10000) && !icount_enabled= ()) { period =3D 10000 / s->delta; period_frac =3D 0; } diff --git a/hw/i386/x86.c b/hw/i386/x86.c index 7a3bc7ab66..002b3cabc2 100644 --- a/hw/i386/x86.c +++ b/hw/i386/x86.c @@ -34,6 +34,7 @@ #include "sysemu/numa.h" #include "sysemu/replay.h" #include "sysemu/sysemu.h" +#include "sysemu/cpu-timers.h" #include "trace.h" =20 #include "hw/i386/x86.h" diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h index d14374bdd4..49eedd714d 100644 --- a/include/exec/cpu-all.h +++ b/include/exec/cpu-all.h @@ -409,8 +409,12 @@ static inline bool tlb_hit(target_ulong tlb_addr, targ= et_ulong addr) return tlb_hit_page(tlb_addr, addr & TARGET_PAGE_MASK); } =20 +#ifdef CONFIG_TCG +void dump_drift_info(void); void dump_exec_info(void); void dump_opcount_info(void); +#endif /* CONFIG_TCG */ + #endif /* !CONFIG_USER_ONLY */ =20 int cpu_memory_rw_debug(CPUState *cpu, target_ulong addr, diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h index 8792bea07a..c1f51e37af 100644 --- a/include/exec/exec-all.h +++ b/include/exec/exec-all.h @@ -25,7 +25,7 @@ #ifdef CONFIG_TCG #include "exec/cpu_ldst.h" #endif -#include "sysemu/cpus.h" +#include "sysemu/cpu-timers.h" =20 /* allow to see translation results - the slowdown should be negligible, s= o we leave it */ #define DEBUG_DISAS @@ -489,7 +489,7 @@ static inline uint32_t tb_cflags(const TranslationBlock= *tb) static inline uint32_t curr_cflags(void) { return (parallel_cpus ? CF_PARALLEL : 0) - | (use_icount ? CF_USE_ICOUNT : 0); + | (icount_enabled() ? CF_USE_ICOUNT : 0); } =20 /* TranslationBlock invalidate API */ diff --git a/include/qemu/timer.h b/include/qemu/timer.h index 6a8b48b5a9..c54b1b2813 100644 --- a/include/qemu/timer.h +++ b/include/qemu/timer.h @@ -224,13 +224,6 @@ void qemu_clock_notify(QEMUClockType type); */ void qemu_clock_enable(QEMUClockType type, bool enabled); =20 -/** - * qemu_start_warp_timer: - * - * Starts a timer for virtual clock update - */ -void qemu_start_warp_timer(void); - /** * qemu_clock_run_timers: * @type: clock on which to operate @@ -791,12 +784,6 @@ static inline int64_t qemu_soonest_timeout(int64_t tim= eout1, int64_t timeout2) */ void init_clocks(QEMUTimerListNotifyCB *notify_cb); =20 -int64_t cpu_get_ticks(void); -/* Caller must hold BQL */ -void cpu_enable_ticks(void); -/* Caller must hold BQL */ -void cpu_disable_ticks(void); - static inline int64_t get_max_clock_jump(void) { /* This should be small enough to prevent excessive interrupts from be= ing @@ -850,13 +837,6 @@ static inline int64_t get_clock(void) } #endif =20 -/* icount */ -int64_t cpu_get_icount_raw(void); -int64_t cpu_get_icount(void); -int64_t cpu_get_clock(void); -int64_t cpu_icount_to_ns(int64_t icount); -void cpu_update_icount(CPUState *cpu); - /*******************************************/ /* host CPU ticks (if available) */ =20 diff --git a/include/sysemu/cpu-timers.h b/include/sysemu/cpu-timers.h new file mode 100644 index 0000000000..3db579fde7 --- /dev/null +++ b/include/sysemu/cpu-timers.h @@ -0,0 +1,73 @@ +#ifndef SYSEMU_CPU_TIMERS_H +#define SYSEMU_CPU_TIMERS_H + +#include "qemu/timer.h" + +/* init the whole cpu timers API, including icount, ticks, and cpu_throttl= e */ +void cpu_timers_init(void); + +/* icount - Instruction Counter API */ + +/* + * Return the icount enablement state: + * + * 0 =3D Disabled - Do not count executed instructions. + * 1 =3D Enabled - Fixed conversion of insn to ns via "shift" option + * 2 =3D Enabled - Runtime adaptive algorithm to compute shift + */ +int icount_enabled(void); +/* + * Update the icount with the executed instructions. Called by + * cpus-tcg vCPU thread so the main-loop can see time has moved forward. + */ +void icount_update(CPUState *cpu); + +/* get raw icount value */ +int64_t icount_get_raw(void); + +/* return the virtual CPU time in ns, based on the instruction counter. */ +int64_t icount_get(void); +/* + * convert an instruction counter value to ns, based on the icount shift. + * This shift is set as a fixed value with the icount "shift" option + * (precise mode), or it is constantly approximated and corrected at + * runtime in adaptive mode. + */ +int64_t icount_to_ns(int64_t icount); + +/* configure the icount options, including "shift" */ +void icount_configure(QemuOpts *opts, Error **errp); + +/* used by tcg vcpu thread to calc icount budget */ +int64_t icount_round(int64_t count); + +/* if the CPUs are idle, start accounting real time to virtual clock. */ +void icount_start_warp_timer(void); +void icount_account_warp_timer(void); + +/* + * CPU Ticks and Clock + */ + +/* Caller must hold BQL */ +void cpu_enable_ticks(void); +/* Caller must hold BQL */ +void cpu_disable_ticks(void); + +/* + * return the time elapsed in VM between vm_start and vm_stop. Unless + * icount is active, cpu_get_ticks() uses units of the host CPU cycle + * counter. + */ +int64_t cpu_get_ticks(void); + +/* + * Returns the monotonic time elapsed in VM, i.e., + * the time between vm_start and vm_stop + */ +int64_t cpu_get_clock(void); + +void qemu_timer_notify_cb(void *opaque, QEMUClockType type); +void qtest_clock_warp(int64_t dest); + +#endif /* SYSEMU_CPU_TIMERS_H */ diff --git a/include/sysemu/cpus.h b/include/sysemu/cpus.h index 3c1da6a018..149de000a0 100644 --- a/include/sysemu/cpus.h +++ b/include/sysemu/cpus.h @@ -4,33 +4,23 @@ #include "qemu/timer.h" =20 /* cpus.c */ +bool all_cpu_threads_idle(void); bool qemu_in_vcpu_thread(void); void qemu_init_cpu_loop(void); void resume_all_vcpus(void); void pause_all_vcpus(void); void cpu_stop_current(void); -void cpu_ticks_init(void); =20 -void configure_icount(QemuOpts *opts, Error **errp); -extern int use_icount; extern int icount_align_option; =20 -/* drift information for info jit command */ -extern int64_t max_delay; -extern int64_t max_advance; -void dump_drift_info(void); - /* Unblock cpu */ void qemu_cpu_kick_self(void); -void qemu_timer_notify_cb(void *opaque, QEMUClockType type); =20 void cpu_synchronize_all_states(void); void cpu_synchronize_all_post_reset(void); void cpu_synchronize_all_post_init(void); void cpu_synchronize_all_pre_loadvm(void); =20 -void qtest_clock_warp(int64_t dest); - #ifndef CONFIG_USER_ONLY /* vl.c */ /* *-user doesn't have configurable SMP topology */ diff --git a/include/sysemu/replay.h b/include/sysemu/replay.h index 5471bb514d..a140d69a73 100644 --- a/include/sysemu/replay.h +++ b/include/sysemu/replay.h @@ -109,12 +109,12 @@ int64_t replay_read_clock(ReplayClockKind kind); #define REPLAY_CLOCK(clock, value) \ (replay_mode =3D=3D REPLAY_MODE_PLAY ? replay_read_clock((clock)) = \ : replay_mode =3D=3D REPLAY_MODE_RECORD = \ - ? replay_save_clock((clock), (value), cpu_get_icount_raw()) \ + ? replay_save_clock((clock), (value), icount_get_raw()) \ : (value)) #define REPLAY_CLOCK_LOCKED(clock, value) \ (replay_mode =3D=3D REPLAY_MODE_PLAY ? replay_read_clock((clock)) = \ : replay_mode =3D=3D REPLAY_MODE_RECORD = \ - ? replay_save_clock((clock), (value), cpu_get_icount_raw_locke= d()) \ + ? replay_save_clock((clock), (value), icount_get_raw_locked())= \ : (value)) =20 /* Processing data from random generators */ diff --git a/qtest.c b/qtest.c index 5672b75c35..a1b92853c9 100644 --- a/qtest.c +++ b/qtest.c @@ -21,7 +21,7 @@ #include "exec/memory.h" #include "hw/irq.h" #include "sysemu/accel.h" -#include "sysemu/cpus.h" +#include "sysemu/cpu-timers.h" #include "qemu/config-file.h" #include "qemu/option.h" #include "qemu/error-report.h" diff --git a/replay/replay.c b/replay/replay.c index 706c7b4f4b..9896a3b6f5 100644 --- a/replay/replay.c +++ b/replay/replay.c @@ -11,10 +11,10 @@ =20 #include "qemu/osdep.h" #include "qapi/error.h" +#include "sysemu/cpu-timers.h" #include "sysemu/replay.h" #include "sysemu/runstate.h" #include "replay-internal.h" -#include "qemu/timer.h" #include "qemu/main-loop.h" #include "qemu/option.h" #include "sysemu/cpus.h" @@ -64,7 +64,7 @@ bool replay_next_event_is(int event) =20 uint64_t replay_get_current_icount(void) { - return cpu_get_icount_raw(); + return icount_get_raw(); } =20 int replay_get_instructions(void) @@ -345,7 +345,7 @@ void replay_start(void) error_reportf_err(replay_blockers->data, "Record/replay: "); exit(1); } - if (!use_icount) { + if (!icount_enabled()) { error_report("Please enable icount to use record/replay"); exit(1); } diff --git a/softmmu/vl.c b/softmmu/vl.c index ae5451bc23..ed53cd1b62 100644 --- a/softmmu/vl.c +++ b/softmmu/vl.c @@ -73,6 +73,7 @@ #include "hw/audio/soundhw.h" #include "audio/audio.h" #include "sysemu/cpus.h" +#include "sysemu/cpu-timers.h" #include "migration/colo.h" #include "migration/postcopy-ram.h" #include "sysemu/kvm.h" @@ -2675,7 +2676,7 @@ static void user_register_global_props(void) =20 static int do_configure_icount(void *opaque, QemuOpts *opts, Error **errp) { - configure_icount(opts, errp); + icount_configure(opts, errp); return 0; } =20 @@ -2785,7 +2786,7 @@ static void configure_accelerators(const char *progna= me) error_report("falling back to %s", ac->name); } =20 - if (use_icount && !(tcg_enabled() || qtest_enabled())) { + if (icount_enabled() && !(tcg_enabled() || qtest_enabled())) { error_report("-icount is not allowed with hardware virtualization"= ); exit(1); } @@ -4233,7 +4234,8 @@ void qemu_init(int argc, char **argv, char **envp) /* spice needs the timers to be initialized by this point */ qemu_spice_init(); =20 - cpu_ticks_init(); + /* initialize cpu timers and VCPU throttle modules */ + cpu_timers_init(); =20 if (default_net) { QemuOptsList *net =3D qemu_find_opts("net"); diff --git a/stubs/clock-warp.c b/stubs/clock-warp.c index b53e5dd94c..304da5091c 100644 --- a/stubs/clock-warp.c +++ b/stubs/clock-warp.c @@ -1,7 +1,7 @@ #include "qemu/osdep.h" -#include "qemu/timer.h" +#include "sysemu/cpu-timers.h" =20 -void qemu_start_warp_timer(void) +void icount_start_warp_timer(void) { } =20 diff --git a/stubs/cpu-get-clock.c b/stubs/cpu-get-clock.c index 5a92810e87..6102338743 100644 --- a/stubs/cpu-get-clock.c +++ b/stubs/cpu-get-clock.c @@ -1,5 +1,5 @@ #include "qemu/osdep.h" -#include "qemu/timer.h" +#include "sysemu/cpu-timers.h" =20 int64_t cpu_get_clock(void) { diff --git a/stubs/cpu-get-icount.c b/stubs/cpu-get-icount.c index b35f844638..23f9154ef3 100644 --- a/stubs/cpu-get-icount.c +++ b/stubs/cpu-get-icount.c @@ -1,20 +1,22 @@ #include "qemu/osdep.h" -#include "qemu/timer.h" -#include "sysemu/cpus.h" +#include "sysemu/cpu-timers.h" #include "qemu/main-loop.h" =20 -int use_icount; - -int64_t cpu_get_icount(void) +int64_t icount_get(void) { abort(); } =20 -int64_t cpu_get_icount_raw(void) +int64_t icount_get_raw(void) { abort(); } =20 +int icount_enabled(void) +{ + return 0; +} + void qemu_timer_notify_cb(void *opaque, QEMUClockType type) { qemu_notify_event(); diff --git a/target/alpha/translate.c b/target/alpha/translate.c index 8870284f57..36be602179 100644 --- a/target/alpha/translate.c +++ b/target/alpha/translate.c @@ -20,6 +20,7 @@ #include "qemu/osdep.h" #include "cpu.h" #include "sysemu/cpus.h" +#include "sysemu/cpu-timers.h" #include "disas/disas.h" #include "qemu/host-utils.h" #include "exec/exec-all.h" @@ -1329,7 +1330,7 @@ static DisasJumpType gen_mfpr(DisasContext *ctx, TCGv= va, int regno) case 249: /* VMTIME */ helper =3D gen_helper_get_vmtime; do_helper: - if (use_icount) { + if (icount_enabled()) { gen_io_start(); helper(va); return DISAS_PC_STALE; diff --git a/target/arm/helper.c b/target/arm/helper.c index a92ae55672..c9f99f7952 100644 --- a/target/arm/helper.c +++ b/target/arm/helper.c @@ -24,6 +24,7 @@ #include "hw/irq.h" #include "hw/semihosting/semihost.h" #include "sysemu/cpus.h" +#include "sysemu/cpu-timers.h" #include "sysemu/kvm.h" #include "sysemu/tcg.h" #include "qemu/range.h" @@ -1205,17 +1206,17 @@ static int64_t cycles_ns_per(uint64_t cycles) =20 static bool instructions_supported(CPUARMState *env) { - return use_icount =3D=3D 1 /* Precise instruction counting */; + return icount_enabled() =3D=3D 1; /* Precise instruction counting */ } =20 static uint64_t instructions_get_count(CPUARMState *env) { - return (uint64_t)cpu_get_icount_raw(); + return (uint64_t)icount_get_raw(); } =20 static int64_t instructions_ns_per(uint64_t icount) { - return cpu_icount_to_ns((int64_t)icount); + return icount_to_ns((int64_t)icount); } #endif =20 diff --git a/target/riscv/csr.c b/target/riscv/csr.c index 11d184cd16..6093f73e3a 100644 --- a/target/riscv/csr.c +++ b/target/riscv/csr.c @@ -194,8 +194,8 @@ static int write_fcsr(CPURISCVState *env, int csrno, ta= rget_ulong val) static int read_instret(CPURISCVState *env, int csrno, target_ulong *val) { #if !defined(CONFIG_USER_ONLY) - if (use_icount) { - *val =3D cpu_get_icount(); + if (icount_enabled()) { + *val =3D icount_get(); } else { *val =3D cpu_get_host_ticks(); } @@ -209,8 +209,8 @@ static int read_instret(CPURISCVState *env, int csrno, = target_ulong *val) static int read_instreth(CPURISCVState *env, int csrno, target_ulong *val) { #if !defined(CONFIG_USER_ONLY) - if (use_icount) { - *val =3D cpu_get_icount() >> 32; + if (icount_enabled()) { + *val =3D icount_get() >> 32; } else { *val =3D cpu_get_host_ticks() >> 32; } diff --git a/tests/ptimer-test-stubs.c b/tests/ptimer-test-stubs.c index ed393d9082..320dcf99b7 100644 --- a/tests/ptimer-test-stubs.c +++ b/tests/ptimer-test-stubs.c @@ -12,6 +12,7 @@ #include "qemu/main-loop.h" #include "sysemu/replay.h" #include "migration/vmstate.h" +#include "sysemu/cpu-timers.h" =20 #include "ptimer-test.h" =20 @@ -126,3 +127,8 @@ void replay_bh_schedule_event(QEMUBH *bh) { bh->cb(bh->opaque); } + +int icount_enabled(void) +{ + return 0; +} diff --git a/tests/test-timed-average.c b/tests/test-timed-average.c index e2bcf5fe13..82c92500df 100644 --- a/tests/test-timed-average.c +++ b/tests/test-timed-average.c @@ -11,7 +11,7 @@ */ =20 #include "qemu/osdep.h" - +#include "sysemu/cpu-timers.h" #include "qemu/timed-average.h" =20 /* This is the clock for QEMU_CLOCK_VIRTUAL */ diff --git a/util/main-loop.c b/util/main-loop.c index eda63fe4e0..f1af697572 100644 --- a/util/main-loop.c +++ b/util/main-loop.c @@ -27,7 +27,7 @@ #include "qemu/cutils.h" #include "qemu/timer.h" #include "sysemu/qtest.h" -#include "sysemu/cpus.h" +#include "sysemu/cpu-timers.h" #include "sysemu/replay.h" #include "qemu/main-loop.h" #include "block/aio.h" @@ -521,7 +521,7 @@ void main_loop_wait(int nonblocking) =20 /* CPU thread can infinitely wait for event after missing the warp */ - qemu_start_warp_timer(); + icount_start_warp_timer(); qemu_clock_run_all_timers(); } =20 diff --git a/util/qemu-timer.c b/util/qemu-timer.c index b6575a2cd5..da2883f914 100644 --- a/util/qemu-timer.c +++ b/util/qemu-timer.c @@ -26,6 +26,7 @@ #include "qemu/main-loop.h" #include "qemu/timer.h" #include "qemu/lockable.h" +#include "sysemu/cpu-timers.h" #include "sysemu/replay.h" #include "sysemu/cpus.h" =20 @@ -134,7 +135,7 @@ static void qemu_clock_init(QEMUClockType type, QEMUTim= erListNotifyCB *notify_cb =20 bool qemu_clock_use_for_deadline(QEMUClockType type) { - return !(use_icount && (type =3D=3D QEMU_CLOCK_VIRTUAL)); + return !(icount_enabled() && (type =3D=3D QEMU_CLOCK_VIRTUAL)); } =20 void qemu_clock_notify(QEMUClockType type) @@ -417,7 +418,7 @@ static void timerlist_rearm(QEMUTimerList *timer_list) { /* Interrupt execution to force deadline recalculation. */ if (timer_list->clock->type =3D=3D QEMU_CLOCK_VIRTUAL) { - qemu_start_warp_timer(); + icount_start_warp_timer(); } timerlist_notify(timer_list); } @@ -647,8 +648,8 @@ int64_t qemu_clock_get_ns(QEMUClockType type) return get_clock(); default: case QEMU_CLOCK_VIRTUAL: - if (use_icount) { - return cpu_get_icount(); + if (icount_enabled()) { + return icount_get(); } else { return cpu_get_clock(); } --=20 2.16.4 From nobody Thu Apr 25 09:48:21 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org ARC-Seal: i=1; a=rsa-sha256; t=1590087409; cv=none; d=zohomail.com; s=zohoarc; b=nF7PQZtGVp4Og/6D4+17O2swMOmauixky/uoUT6LsAv0GQxX5w+ZAPT1APm38wnlzVmCnZr3UG19d8tOTo5ZCq6PZx2znD32WqnrZ/drRpjznIwEbRfHag7wajy7fmDwTw5as81bthTOoliEk4pxEqcVb6/rWEp/ctLCdECURP0= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1590087409; h=Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:Message-ID:References:Sender:Subject:To; bh=l6atjsHJE8QS7cd8+REZO2FvWqHbovmDKJiCW8SE6aw=; b=hgFeOb83M9ct6inU/8CJYi3JjdWz1D7skXpj3WdPUPEFJzEXBBGqa5w21DsL8w8gdpPwQ9CsBN3Qqvy4v90HZYlDVHtVDnZneZggGGQXyHYpUZG4+RoaGbfyEy5hpcxFcnVgQYCNIXghmt9Yds0sjP9N0K5cUyGgXIz8qVZOUXc= ARC-Authentication-Results: i=1; mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1590087409506544.5809930320231; Thu, 21 May 2020 11:56:49 -0700 (PDT) Received: from localhost ([::1]:47126 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jbqN5-0004Nw-Rk for importer@patchew.org; Thu, 21 May 2020 14:56:47 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:35316) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jbqKh-00014Y-Uj for qemu-devel@nongnu.org; Thu, 21 May 2020 14:54:20 -0400 Received: from mx2.suse.de ([195.135.220.15]:49938) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jbqKc-00042A-WE for qemu-devel@nongnu.org; Thu, 21 May 2020 14:54:19 -0400 Received: from relay2.suse.de (unknown [195.135.220.254]) by mx2.suse.de (Postfix) with ESMTP id E4191B01C; Thu, 21 May 2020 18:54:15 +0000 (UTC) X-Virus-Scanned: by amavisd-new at test-mx.suse.de From: Claudio Fontana To: Paolo Bonzini , =?UTF-8?q?Alex=20Benn=C3=A9e?= , Peter Maydell , =?UTF-8?q?Philippe=20Mathieu-Daud=C3=A9?= Subject: [RFC 3/3] cpus: implement cpus interfaces for per-accel threads Date: Thu, 21 May 2020 20:54:07 +0200 Message-Id: <20200521185407.25311-4-cfontana@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20200521185407.25311-1-cfontana@suse.de> References: <20200521185407.25311-1-cfontana@suse.de> Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=195.135.220.15; envelope-from=cfontana@suse.de; helo=mx2.suse.de X-detected-operating-system: by eggs.gnu.org: First seen = 2020/05/21 02:02:44 X-ACL-Warn: Detected OS = Linux 2.2.x-3.x (no timestamps) [generic] X-Spam_score_int: -41 X-Spam_score: -4.2 X-Spam_bar: ---- X-Spam_report: (-4.2 / 5.0 requ) BAYES_00=-1.9, RCVD_IN_DNSWL_MED=-2.3, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Laurent Vivier , Thomas Huth , Eduardo Habkost , Marcelo Tosatti , "open list:All patches CC here" , Roman Bolshakov , Wenchao Wang , Colin Xu , Claudio Fontana , "open list:X86 HAXM CPUs" , Sunil Muthuswamy , Richard Henderson Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Claudio Fontana --- MAINTAINERS | 1 + accel/kvm/Makefile.objs | 2 + accel/kvm/kvm-all.c | 15 +- accel/kvm/kvm-cpus-interface.c | 94 ++++ accel/kvm/kvm-cpus-interface.h | 8 + accel/qtest.c | 82 ++++ accel/stubs/kvm-stub.c | 3 +- accel/tcg/Makefile.objs | 1 + accel/tcg/tcg-all.c | 12 +- accel/tcg/tcg-cpus-interface.c | 523 ++++++++++++++++++++ accel/tcg/tcg-cpus-interface.h | 8 + cpus.c | 911 +++----------------------------= ---- hw/core/cpu.c | 1 + include/sysemu/cpus.h | 44 ++ include/sysemu/hvf.h | 1 - include/sysemu/hw_accel.h | 57 +-- include/sysemu/kvm.h | 2 +- stubs/Makefile.objs | 1 + stubs/cpu-synchronize-state.c | 15 + target/i386/Makefile.objs | 7 +- target/i386/hax-all.c | 6 +- target/i386/hax-cpus-interface.c | 85 ++++ target/i386/hax-cpus-interface.h | 8 + target/i386/hax-i386.h | 2 + target/i386/hax-posix.c | 12 + target/i386/hax-windows.c | 20 + target/i386/hvf/Makefile.objs | 2 +- target/i386/hvf/hvf-cpus-interface.c | 92 ++++ target/i386/hvf/hvf-cpus-interface.h | 8 + target/i386/hvf/hvf.c | 5 +- target/i386/whpx-all.c | 3 + target/i386/whpx-cpus-interface.c | 96 ++++ target/i386/whpx-cpus-interface.h | 8 + 33 files changed, 1232 insertions(+), 903 deletions(-) create mode 100644 accel/kvm/kvm-cpus-interface.c create mode 100644 accel/kvm/kvm-cpus-interface.h create mode 100644 accel/tcg/tcg-cpus-interface.c create mode 100644 accel/tcg/tcg-cpus-interface.h create mode 100644 stubs/cpu-synchronize-state.c create mode 100644 target/i386/hax-cpus-interface.c create mode 100644 target/i386/hax-cpus-interface.h create mode 100644 target/i386/hvf/hvf-cpus-interface.c create mode 100644 target/i386/hvf/hvf-cpus-interface.h create mode 100644 target/i386/whpx-cpus-interface.c create mode 100644 target/i386/whpx-cpus-interface.h diff --git a/MAINTAINERS b/MAINTAINERS index 1b3b17fda8..a256d5574c 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -426,6 +426,7 @@ WHPX CPUs M: Sunil Muthuswamy S: Supported F: target/i386/whpx-all.c +F: target/i386/whpx-cpus-interface.c F: target/i386/whp-dispatch.h F: accel/stubs/whpx-stub.c F: include/sysemu/whpx.h diff --git a/accel/kvm/Makefile.objs b/accel/kvm/Makefile.objs index fdfa481578..4babbf7796 100644 --- a/accel/kvm/Makefile.objs +++ b/accel/kvm/Makefile.objs @@ -1,2 +1,4 @@ obj-y +=3D kvm-all.o +obj-y +=3D kvm-cpus-interface.o + obj-$(call lnot,$(CONFIG_SEV)) +=3D sev-stub.o diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c index d06cc04079..c9cbbb1184 100644 --- a/accel/kvm/kvm-all.c +++ b/accel/kvm/kvm-all.c @@ -45,6 +45,10 @@ #include "qapi/qapi-types-common.h" #include "qapi/qapi-visit-common.h" #include "sysemu/reset.h" +#include "qemu/guest-random.h" + +#include "sysemu/hw_accel.h" +#include "kvm-cpus-interface.h" =20 #include "hw/boards.h" =20 @@ -329,7 +333,7 @@ err: return ret; } =20 -int kvm_destroy_vcpu(CPUState *cpu) +static int do_kvm_destroy_vcpu(CPUState *cpu) { KVMState *s =3D kvm_state; long mmap_size; @@ -363,6 +367,14 @@ err: return ret; } =20 +void kvm_destroy_vcpu(CPUState *cpu) +{ + if (do_kvm_destroy_vcpu(cpu) < 0) { + error_report("kvm_destroy_vcpu failed"); + exit(EXIT_FAILURE); + } +} + static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id) { struct KVMParkedVcpu *cpu; @@ -2146,6 +2158,7 @@ static int kvm_init(MachineState *ms) qemu_balloon_inhibit(true); } =20 + cpus_register_accel_interface(&kvm_cpus_interface); return 0; =20 err: diff --git a/accel/kvm/kvm-cpus-interface.c b/accel/kvm/kvm-cpus-interface.c new file mode 100644 index 0000000000..fd3d117364 --- /dev/null +++ b/accel/kvm/kvm-cpus-interface.c @@ -0,0 +1,94 @@ +/* + * QEMU KVM support + * + * Copyright IBM, Corp. 2008 + * Red Hat, Inc. 2008 + * + * Authors: + * Anthony Liguori + * Glauber Costa + * + * This work is licensed under the terms of the GNU GPL, version 2 or late= r. + * See the COPYING file in the top-level directory. + * + */ + +#include "qemu/osdep.h" +#include "qemu/error-report.h" +#include "qemu/main-loop.h" +#include "sysemu/kvm_int.h" +#include "sysemu/runstate.h" +#include "sysemu/cpus.h" +#include "qemu/guest-random.h" + +#include "kvm-cpus-interface.h" + +static void kvm_kick_vcpu_thread(CPUState *cpu) +{ + cpus_kick_thread(cpu); +} + +static void *kvm_vcpu_thread_fn(void *arg) +{ + CPUState *cpu =3D arg; + int r; + + rcu_register_thread(); + + qemu_mutex_lock_iothread(); + qemu_thread_get_self(cpu->thread); + cpu->thread_id =3D qemu_get_thread_id(); + cpu->can_do_io =3D 1; + current_cpu =3D cpu; + + r =3D kvm_init_vcpu(cpu); + if (r < 0) { + error_report("kvm_init_vcpu failed: %s", strerror(-r)); + exit(1); + } + + kvm_init_cpu_signals(cpu); + + /* signal CPU creation */ + cpu_thread_signal_created(cpu); + qemu_guest_random_seed_thread_part2(cpu->random_seed); + + do { + if (cpu_can_run(cpu)) { + r =3D kvm_cpu_exec(cpu); + if (r =3D=3D EXCP_DEBUG) { + cpu_handle_guest_debug(cpu); + } + } + qemu_wait_io_event(cpu); + } while (!cpu->unplug || cpu_can_run(cpu)); + + kvm_destroy_vcpu(cpu); + cpu_thread_signal_destroyed(cpu); + qemu_mutex_unlock_iothread(); + rcu_unregister_thread(); + return NULL; +} + +static void kvm_start_vcpu_thread(CPUState *cpu) +{ + char thread_name[VCPU_THREAD_NAME_SIZE]; + + cpu->thread =3D g_malloc0(sizeof(QemuThread)); + cpu->halt_cond =3D g_malloc0(sizeof(QemuCond)); + qemu_cond_init(cpu->halt_cond); + snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/KVM", + cpu->cpu_index); + qemu_thread_create(cpu->thread, thread_name, kvm_vcpu_thread_fn, + cpu, QEMU_THREAD_JOINABLE); +} + +CpusAccelInterface kvm_cpus_interface =3D { + .create_vcpu_thread =3D kvm_start_vcpu_thread, + .kick_vcpu_thread =3D kvm_kick_vcpu_thread, + + .cpu_synchronize_post_reset =3D kvm_cpu_synchronize_post_reset, + .cpu_synchronize_post_init =3D kvm_cpu_synchronize_post_init, + .cpu_synchronize_state =3D kvm_cpu_synchronize_state, + .cpu_synchronize_pre_loadvm =3D kvm_cpu_synchronize_pre_loadvm, +}; diff --git a/accel/kvm/kvm-cpus-interface.h b/accel/kvm/kvm-cpus-interface.h new file mode 100644 index 0000000000..5531a4a4ad --- /dev/null +++ b/accel/kvm/kvm-cpus-interface.h @@ -0,0 +1,8 @@ +#ifndef KVM_CPUS_INTERFACE_H +#define KVM_CPUS_INTERFACE_H + +#include "sysemu/cpus.h" + +extern CpusAccelInterface kvm_cpus_interface; + +#endif /* KVM_CPUS_INTERFACE */ diff --git a/accel/qtest.c b/accel/qtest.c index ef9ee0941a..1677c8724d 100644 --- a/accel/qtest.c +++ b/accel/qtest.c @@ -12,6 +12,7 @@ */ =20 #include "qemu/osdep.h" +#include "qemu/rcu.h" #include "qapi/error.h" #include "qemu/module.h" #include "qemu/option.h" @@ -20,6 +21,86 @@ #include "sysemu/qtest.h" #include "sysemu/cpus.h" #include "sysemu/cpu-timers.h" +#include "qemu/guest-random.h" +#include "qemu/main-loop.h" +#include "hw/core/cpu.h" + +static void qtest_cpu_synchronize_noop(CPUState *cpu) +{ +} + +static void qtest_kick_vcpu_thread(CPUState *cpu) +{ + cpus_kick_thread(cpu); +} + +static void *qtest_cpu_thread_fn(void *arg) +{ +#ifdef _WIN32 + error_report("qtest is not supported under Windows"); + exit(1); +#else + CPUState *cpu =3D arg; + sigset_t waitset; + int r; + + rcu_register_thread(); + + qemu_mutex_lock_iothread(); + qemu_thread_get_self(cpu->thread); + cpu->thread_id =3D qemu_get_thread_id(); + cpu->can_do_io =3D 1; + current_cpu =3D cpu; + + sigemptyset(&waitset); + sigaddset(&waitset, SIG_IPI); + + /* signal CPU creation */ + cpu_thread_signal_created(cpu); + qemu_guest_random_seed_thread_part2(cpu->random_seed); + + do { + qemu_mutex_unlock_iothread(); + do { + int sig; + r =3D sigwait(&waitset, &sig); + } while (r =3D=3D -1 && (errno =3D=3D EAGAIN || errno =3D=3D EINTR= )); + if (r =3D=3D -1) { + perror("sigwait"); + exit(1); + } + qemu_mutex_lock_iothread(); + qemu_wait_io_event(cpu); + } while (!cpu->unplug); + + qemu_mutex_unlock_iothread(); + rcu_unregister_thread(); + return NULL; +#endif +} + +static void qtest_start_vcpu_thread(CPUState *cpu) +{ + char thread_name[VCPU_THREAD_NAME_SIZE]; + + cpu->thread =3D g_malloc0(sizeof(QemuThread)); + cpu->halt_cond =3D g_malloc0(sizeof(QemuCond)); + qemu_cond_init(cpu->halt_cond); + snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/DUMMY", + cpu->cpu_index); + qemu_thread_create(cpu->thread, thread_name, qtest_cpu_thread_fn, cpu, + QEMU_THREAD_JOINABLE); +} + +CpusAccelInterface qtest_cpus_interface =3D { + .create_vcpu_thread =3D qtest_start_vcpu_thread, + .kick_vcpu_thread =3D qtest_kick_vcpu_thread, + + .cpu_synchronize_post_reset =3D qtest_cpu_synchronize_noop, + .cpu_synchronize_post_init =3D qtest_cpu_synchronize_noop, + .cpu_synchronize_state =3D qtest_cpu_synchronize_noop, + .cpu_synchronize_pre_loadvm =3D qtest_cpu_synchronize_noop, +}; =20 static int qtest_init_accel(MachineState *ms) { @@ -28,6 +109,7 @@ static int qtest_init_accel(MachineState *ms) qemu_opt_set(opts, "shift", "0", &error_abort); icount_configure(opts, &error_abort); qemu_opts_del(opts); + cpus_register_accel_interface(&qtest_cpus_interface); return 0; } =20 diff --git a/accel/stubs/kvm-stub.c b/accel/stubs/kvm-stub.c index 82f118d2df..69f8a842da 100644 --- a/accel/stubs/kvm-stub.c +++ b/accel/stubs/kvm-stub.c @@ -32,9 +32,8 @@ bool kvm_readonly_mem_allowed; bool kvm_ioeventfd_any_length_allowed; bool kvm_msi_use_devid; =20 -int kvm_destroy_vcpu(CPUState *cpu) +void kvm_destroy_vcpu(CPUState *cpu) { - return -ENOSYS; } =20 int kvm_init_vcpu(CPUState *cpu) diff --git a/accel/tcg/Makefile.objs b/accel/tcg/Makefile.objs index a92f2c454b..ddc57acae2 100644 --- a/accel/tcg/Makefile.objs +++ b/accel/tcg/Makefile.objs @@ -1,5 +1,6 @@ obj-$(CONFIG_SOFTMMU) +=3D tcg-all.o obj-$(CONFIG_SOFTMMU) +=3D cputlb.o +obj-$(CONFIG_SOFTMMU) +=3D tcg-cpus-interface.o obj-y +=3D tcg-runtime.o tcg-runtime-gvec.o obj-y +=3D cpu-exec.o cpu-exec-common.o translate-all.o obj-y +=3D translator.o diff --git a/accel/tcg/tcg-all.c b/accel/tcg/tcg-all.c index e27385d051..9e332585d3 100644 --- a/accel/tcg/tcg-all.c +++ b/accel/tcg/tcg-all.c @@ -24,19 +24,17 @@ */ =20 #include "qemu/osdep.h" -#include "sysemu/accel.h" +#include "qemu-common.h" #include "sysemu/tcg.h" -#include "qom/object.h" -#include "cpu.h" -#include "sysemu/cpus.h" #include "sysemu/cpu-timers.h" -#include "qemu/main-loop.h" #include "tcg/tcg.h" #include "qapi/error.h" #include "qemu/error-report.h" #include "hw/boards.h" #include "qapi/qapi-builtin-visit.h" =20 +#include "tcg-cpus-interface.h" + typedef struct TCGState { AccelState parent_obj; =20 @@ -123,6 +121,8 @@ static void tcg_accel_instance_init(Object *obj) s->mttcg_enabled =3D default_mttcg_enabled(); } =20 +bool mttcg_enabled; + static int tcg_init(MachineState *ms) { TCGState *s =3D TCG_STATE(current_accel()); @@ -130,6 +130,8 @@ static int tcg_init(MachineState *ms) tcg_exec_init(s->tb_size * 1024 * 1024); cpu_interrupt_handler =3D tcg_handle_interrupt; mttcg_enabled =3D s->mttcg_enabled; + cpus_register_accel_interface(&tcg_cpus_interface); + return 0; } =20 diff --git a/accel/tcg/tcg-cpus-interface.c b/accel/tcg/tcg-cpus-interface.c new file mode 100644 index 0000000000..28a88beb84 --- /dev/null +++ b/accel/tcg/tcg-cpus-interface.c @@ -0,0 +1,523 @@ +/* + * QEMU System Emulator, accelerator interfaces + * + * Copyright (c) 2003-2008 Fabrice Bellard + * Copyright (c) 2014 Red Hat Inc. + * + * Permission is hereby granted, free of charge, to any person obtaining a= copy + * of this software and associated documentation files (the "Software"), t= o deal + * in the Software without restriction, including without limitation the r= ights + * to use, copy, modify, merge, publish, distribute, sublicense, and/or se= ll + * copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included= in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS= OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OT= HER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING= FROM, + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS = IN + * THE SOFTWARE. + */ + +#include "qemu/osdep.h" +#include "qemu-common.h" +#include "sysemu/tcg.h" +#include "sysemu/replay.h" +#include "qemu/main-loop.h" +#include "qemu/guest-random.h" +#include "exec/exec-all.h" + +#include "tcg-cpus-interface.h" + +/* Kick all RR vCPUs */ +static void qemu_cpu_kick_rr_cpus(void) +{ + CPUState *cpu; + + CPU_FOREACH(cpu) { + cpu_exit(cpu); + }; +} + +static void tcg_kick_vcpu_thread(CPUState *cpu) +{ + if (qemu_tcg_mttcg_enabled()) { + cpu_exit(cpu); + } else { + qemu_cpu_kick_rr_cpus(); + } +} + +/* + * TCG vCPU kick timer + * + * The kick timer is responsible for moving single threaded vCPU + * emulation on to the next vCPU. If more than one vCPU is running a + * timer event with force a cpu->exit so the next vCPU can get + * scheduled. + * + * The timer is removed if all vCPUs are idle and restarted again once + * idleness is complete. + */ + +static QEMUTimer *tcg_kick_vcpu_timer; +static CPUState *tcg_current_rr_cpu; + +#define TCG_KICK_PERIOD (NANOSECONDS_PER_SECOND / 10) + +static inline int64_t qemu_tcg_next_kick(void) +{ + return qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + TCG_KICK_PERIOD; +} + +/* Kick the currently round-robin scheduled vCPU to next */ +static void qemu_cpu_kick_rr_next_cpu(void) +{ + CPUState *cpu; + do { + cpu =3D atomic_mb_read(&tcg_current_rr_cpu); + if (cpu) { + cpu_exit(cpu); + } + } while (cpu !=3D atomic_mb_read(&tcg_current_rr_cpu)); +} + +static void kick_tcg_thread(void *opaque) +{ + timer_mod(tcg_kick_vcpu_timer, qemu_tcg_next_kick()); + qemu_cpu_kick_rr_next_cpu(); +} + +static void start_tcg_kick_timer(void) +{ + assert(!mttcg_enabled); + if (!tcg_kick_vcpu_timer && CPU_NEXT(first_cpu)) { + tcg_kick_vcpu_timer =3D timer_new_ns(QEMU_CLOCK_VIRTUAL, + kick_tcg_thread, NULL); + } + if (tcg_kick_vcpu_timer && !timer_pending(tcg_kick_vcpu_timer)) { + timer_mod(tcg_kick_vcpu_timer, qemu_tcg_next_kick()); + } +} + +static void stop_tcg_kick_timer(void) +{ + assert(!mttcg_enabled); + if (tcg_kick_vcpu_timer && timer_pending(tcg_kick_vcpu_timer)) { + timer_del(tcg_kick_vcpu_timer); + } +} + +static void qemu_tcg_destroy_vcpu(CPUState *cpu) +{ +} + +static void qemu_tcg_rr_wait_io_event(void) +{ + CPUState *cpu; + + while (all_cpu_threads_idle()) { + stop_tcg_kick_timer(); + qemu_cond_wait_iothread(first_cpu->halt_cond); + } + + start_tcg_kick_timer(); + + CPU_FOREACH(cpu) { + qemu_wait_io_event_common(cpu); + } +} + +static int64_t tcg_get_icount_limit(void) +{ + int64_t deadline; + + if (replay_mode !=3D REPLAY_MODE_PLAY) { + /* + * Include all the timers, because they may need an attention. + * Too long CPU execution may create unnecessary delay in UI. + */ + deadline =3D qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL, + QEMU_TIMER_ATTR_ALL); + /* Check realtime timers, because they help with input processing = */ + deadline =3D qemu_soonest_timeout(deadline, + qemu_clock_deadline_ns_all(QEMU_CLOCK_REALTIME, + QEMU_TIMER_ATTR_ALL)); + + /* + * Maintain prior (possibly buggy) behaviour where if no deadline + * was set (as there is no QEMU_CLOCK_VIRTUAL timer) or it is more= than + * INT32_MAX nanoseconds ahead, we still use INT32_MAX + * nanoseconds. + */ + if ((deadline < 0) || (deadline > INT32_MAX)) { + deadline =3D INT32_MAX; + } + + return icount_round(deadline); + } else { + return replay_get_instructions(); + } +} + +static void handle_icount_deadline(void) +{ + assert(qemu_in_vcpu_thread()); + if (icount_enabled()) { + int64_t deadline =3D qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL, + QEMU_TIMER_ATTR_ALL); + + if (deadline =3D=3D 0) { + /* Wake up other AioContexts. */ + qemu_clock_notify(QEMU_CLOCK_VIRTUAL); + qemu_clock_run_timers(QEMU_CLOCK_VIRTUAL); + } + } +} + +static void prepare_icount_for_run(CPUState *cpu) +{ + if (icount_enabled()) { + int insns_left; + + /* + * These should always be cleared by process_icount_data after + * each vCPU execution. However u16.high can be raised + * asynchronously by cpu_exit/cpu_interrupt/tcg_handle_interrupt + */ + g_assert(cpu_neg(cpu)->icount_decr.u16.low =3D=3D 0); + g_assert(cpu->icount_extra =3D=3D 0); + + cpu->icount_budget =3D tcg_get_icount_limit(); + insns_left =3D MIN(0xffff, cpu->icount_budget); + cpu_neg(cpu)->icount_decr.u16.low =3D insns_left; + cpu->icount_extra =3D cpu->icount_budget - insns_left; + + replay_mutex_lock(); + } +} + +static void process_icount_data(CPUState *cpu) +{ + if (icount_enabled()) { + /* Account for executed instructions */ + icount_update(cpu); + + /* Reset the counters */ + cpu_neg(cpu)->icount_decr.u16.low =3D 0; + cpu->icount_extra =3D 0; + cpu->icount_budget =3D 0; + + replay_account_executed_instructions(); + + replay_mutex_unlock(); + } +} + +static int tcg_cpu_exec(CPUState *cpu) +{ + int ret; +#ifdef CONFIG_PROFILER + int64_t ti; +#endif + + assert(tcg_enabled()); +#ifdef CONFIG_PROFILER + ti =3D profile_getclock(); +#endif + cpu_exec_start(cpu); + ret =3D cpu_exec(cpu); + cpu_exec_end(cpu); +#ifdef CONFIG_PROFILER + atomic_set(&tcg_ctx->prof.cpu_exec_time, + tcg_ctx->prof.cpu_exec_time + profile_getclock() - ti); +#endif + return ret; +} + +/* + * Destroy any remaining vCPUs which have been unplugged and have + * finished running + */ +static void deal_with_unplugged_cpus(void) +{ + CPUState *cpu; + + CPU_FOREACH(cpu) { + if (cpu->unplug && !cpu_can_run(cpu)) { + qemu_tcg_destroy_vcpu(cpu); + cpu_thread_signal_destroyed(cpu); + break; + } + } +} + +/* + * Single-threaded TCG + * + * In the single-threaded case each vCPU is simulated in turn. If + * there is more than a single vCPU we create a simple timer to kick + * the vCPU and ensure we don't get stuck in a tight loop in one vCPU. + * This is done explicitly rather than relying on side-effects + * elsewhere. + */ + +static void *tcg_rr_cpu_thread_fn(void *arg) +{ + CPUState *cpu =3D arg; + + assert(tcg_enabled()); + rcu_register_thread(); + tcg_register_thread(); + + qemu_mutex_lock_iothread(); + qemu_thread_get_self(cpu->thread); + + cpu->thread_id =3D qemu_get_thread_id(); + cpu->can_do_io =3D 1; + cpu_thread_signal_created(cpu); + qemu_guest_random_seed_thread_part2(cpu->random_seed); + + /* wait for initial kick-off after machine start */ + while (first_cpu->stopped) { + qemu_cond_wait_iothread(first_cpu->halt_cond); + + /* process any pending work */ + CPU_FOREACH(cpu) { + current_cpu =3D cpu; + qemu_wait_io_event_common(cpu); + } + } + + start_tcg_kick_timer(); + + cpu =3D first_cpu; + + /* process any pending work */ + cpu->exit_request =3D 1; + + while (1) { + qemu_mutex_unlock_iothread(); + replay_mutex_lock(); + qemu_mutex_lock_iothread(); + /* Account partial waits to QEMU_CLOCK_VIRTUAL. */ + icount_account_warp_timer(); + + /* + * Run the timers here. This is much more efficient than + * waking up the I/O thread and waiting for completion. + */ + handle_icount_deadline(); + + replay_mutex_unlock(); + + if (!cpu) { + cpu =3D first_cpu; + } + + while (cpu && !cpu->queued_work_first && !cpu->exit_request) { + + atomic_mb_set(&tcg_current_rr_cpu, cpu); + current_cpu =3D cpu; + + qemu_clock_enable(QEMU_CLOCK_VIRTUAL, + (cpu->singlestep_enabled & SSTEP_NOTIMER) = =3D=3D 0); + + if (cpu_can_run(cpu)) { + int r; + + qemu_mutex_unlock_iothread(); + prepare_icount_for_run(cpu); + + r =3D tcg_cpu_exec(cpu); + + process_icount_data(cpu); + qemu_mutex_lock_iothread(); + + if (r =3D=3D EXCP_DEBUG) { + cpu_handle_guest_debug(cpu); + break; + } else if (r =3D=3D EXCP_ATOMIC) { + qemu_mutex_unlock_iothread(); + cpu_exec_step_atomic(cpu); + qemu_mutex_lock_iothread(); + break; + } + } else if (cpu->stop) { + if (cpu->unplug) { + cpu =3D CPU_NEXT(cpu); + } + break; + } + + cpu =3D CPU_NEXT(cpu); + } /* while (cpu && !cpu->exit_request).. */ + + /* Does not need atomic_mb_set because a spurious wakeup is okay. = */ + atomic_set(&tcg_current_rr_cpu, NULL); + + if (cpu && cpu->exit_request) { + atomic_mb_set(&cpu->exit_request, 0); + } + + if (icount_enabled() && all_cpu_threads_idle()) { + /* + * When all cpus are sleeping (e.g in WFI), to avoid a deadlock + * in the main_loop, wake it up in order to start the warp tim= er. + */ + qemu_notify_event(); + } + + qemu_tcg_rr_wait_io_event(); + deal_with_unplugged_cpus(); + } + + rcu_unregister_thread(); + return NULL; +} + +/* + * Multi-threaded TCG + * + * In the multi-threaded case each vCPU has its own thread. The TLS + * variable current_cpu can be used deep in the code to find the + * current CPUState for a given thread. + */ + +static void *tcg_cpu_thread_fn(void *arg) +{ + CPUState *cpu =3D arg; + + assert(tcg_enabled()); + g_assert(!icount_enabled()); + + rcu_register_thread(); + tcg_register_thread(); + + qemu_mutex_lock_iothread(); + qemu_thread_get_self(cpu->thread); + + cpu->thread_id =3D qemu_get_thread_id(); + cpu->can_do_io =3D 1; + current_cpu =3D cpu; + cpu_thread_signal_created(cpu); + qemu_guest_random_seed_thread_part2(cpu->random_seed); + + /* process any pending work */ + cpu->exit_request =3D 1; + + do { + if (cpu_can_run(cpu)) { + int r; + qemu_mutex_unlock_iothread(); + r =3D tcg_cpu_exec(cpu); + qemu_mutex_lock_iothread(); + switch (r) { + case EXCP_DEBUG: + cpu_handle_guest_debug(cpu); + break; + case EXCP_HALTED: + /* + * during start-up the vCPU is reset and the thread is + * kicked several times. If we don't ensure we go back + * to sleep in the halted state we won't cleanly + * start-up when the vCPU is enabled. + * + * cpu->halted should ensure we sleep in wait_io_event + */ + g_assert(cpu->halted); + break; + case EXCP_ATOMIC: + qemu_mutex_unlock_iothread(); + cpu_exec_step_atomic(cpu); + qemu_mutex_lock_iothread(); + default: + /* Ignore everything else? */ + break; + } + } + + atomic_mb_set(&cpu->exit_request, 0); + qemu_wait_io_event(cpu); + } while (!cpu->unplug || cpu_can_run(cpu)); + + qemu_tcg_destroy_vcpu(cpu); + cpu_thread_signal_destroyed(cpu); + qemu_mutex_unlock_iothread(); + rcu_unregister_thread(); + return NULL; +} + +static void tcg_start_vcpu_thread(CPUState *cpu) +{ + char thread_name[VCPU_THREAD_NAME_SIZE]; + static QemuCond *single_tcg_halt_cond; + static QemuThread *single_tcg_cpu_thread; + static int tcg_region_inited; + + assert(tcg_enabled()); + /* + * Initialize TCG regions--once. Now is a good time, because: + * (1) TCG's init context, prologue and target globals have been set u= p. + * (2) qemu_tcg_mttcg_enabled() works now (TCG init code runs before t= he + * -accel flag is processed, so the check doesn't work then). + */ + if (!tcg_region_inited) { + tcg_region_inited =3D 1; + tcg_region_init(); + } + + if (qemu_tcg_mttcg_enabled() || !single_tcg_cpu_thread) { + cpu->thread =3D g_malloc0(sizeof(QemuThread)); + cpu->halt_cond =3D g_malloc0(sizeof(QemuCond)); + qemu_cond_init(cpu->halt_cond); + + if (qemu_tcg_mttcg_enabled()) { + /* create a thread per vCPU with TCG (MTTCG) */ + parallel_cpus =3D true; + snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/TCG", + cpu->cpu_index); + + qemu_thread_create(cpu->thread, thread_name, tcg_cpu_thread_fn, + cpu, QEMU_THREAD_JOINABLE); + + } else { + /* share a single thread for all cpus with TCG */ + snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "ALL CPUs/TCG"); + qemu_thread_create(cpu->thread, thread_name, + tcg_rr_cpu_thread_fn, + cpu, QEMU_THREAD_JOINABLE); + + single_tcg_halt_cond =3D cpu->halt_cond; + single_tcg_cpu_thread =3D cpu->thread; + } +#ifdef _WIN32 + cpu->hThread =3D qemu_thread_get_handle(cpu->thread); +#endif + } else { + /* For non-MTTCG cases we share the thread */ + cpu->thread =3D single_tcg_cpu_thread; + cpu->halt_cond =3D single_tcg_halt_cond; + cpu->thread_id =3D first_cpu->thread_id; + cpu->can_do_io =3D 1; + cpu->created =3D true; + } +} + +static void tcg_cpu_synchronize_noop(CPUState *cpu) +{ +} + +CpusAccelInterface tcg_cpus_interface =3D { + .create_vcpu_thread =3D tcg_start_vcpu_thread, + .kick_vcpu_thread =3D tcg_kick_vcpu_thread, + + .cpu_synchronize_post_reset =3D tcg_cpu_synchronize_noop, + .cpu_synchronize_post_init =3D tcg_cpu_synchronize_noop, + .cpu_synchronize_state =3D tcg_cpu_synchronize_noop, + .cpu_synchronize_pre_loadvm =3D tcg_cpu_synchronize_noop, +}; diff --git a/accel/tcg/tcg-cpus-interface.h b/accel/tcg/tcg-cpus-interface.h new file mode 100644 index 0000000000..c6e96b2af4 --- /dev/null +++ b/accel/tcg/tcg-cpus-interface.h @@ -0,0 +1,8 @@ +#ifndef TCG_CPUS_INTERFACE_H +#define TCG_CPUS_INTERFACE_H + +#include "sysemu/cpus.h" + +extern CpusAccelInterface tcg_cpus_interface; + +#endif /* TCG_CPUS_INTERFACE */ diff --git a/cpus.c b/cpus.c index 7e9f545be8..3f5d34981e 100644 --- a/cpus.c +++ b/cpus.c @@ -24,27 +24,19 @@ =20 #include "qemu/osdep.h" #include "qemu-common.h" -#include "qemu/config-file.h" -#include "qemu/cutils.h" -#include "migration/vmstate.h" #include "monitor/monitor.h" #include "qapi/error.h" #include "qapi/qapi-commands-misc.h" #include "qapi/qapi-events-run-state.h" #include "qapi/qmp/qerror.h" -#include "qemu/error-report.h" -#include "qemu/qemu-print.h" #include "sysemu/tcg.h" -#include "sysemu/block-backend.h" #include "exec/gdbstub.h" -#include "sysemu/dma.h" #include "sysemu/hw_accel.h" #include "sysemu/kvm.h" #include "sysemu/hax.h" #include "sysemu/hvf.h" #include "sysemu/whpx.h" #include "exec/exec-all.h" - #include "qemu/thread.h" #include "qemu/plugin.h" #include "sysemu/cpus.h" @@ -87,7 +79,7 @@ bool cpu_is_stopped(CPUState *cpu) return cpu->stopped || !runstate_is_running(); } =20 -static bool cpu_thread_is_idle(CPUState *cpu) +bool cpu_thread_is_idle(CPUState *cpu) { if (cpu->stop || cpu->queued_work_first) { return false; @@ -114,78 +106,6 @@ bool all_cpu_threads_idle(void) return true; } =20 -bool mttcg_enabled; - -/***********************************************************/ -/* TCG vCPU kick timer - * - * The kick timer is responsible for moving single threaded vCPU - * emulation on to the next vCPU. If more than one vCPU is running a - * timer event with force a cpu->exit so the next vCPU can get - * scheduled. - * - * The timer is removed if all vCPUs are idle and restarted again once - * idleness is complete. - */ - -static QEMUTimer *tcg_kick_vcpu_timer; -static CPUState *tcg_current_rr_cpu; - -#define TCG_KICK_PERIOD (NANOSECONDS_PER_SECOND / 10) - -static inline int64_t qemu_tcg_next_kick(void) -{ - return qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + TCG_KICK_PERIOD; -} - -/* Kick the currently round-robin scheduled vCPU to next */ -static void qemu_cpu_kick_rr_next_cpu(void) -{ - CPUState *cpu; - do { - cpu =3D atomic_mb_read(&tcg_current_rr_cpu); - if (cpu) { - cpu_exit(cpu); - } - } while (cpu !=3D atomic_mb_read(&tcg_current_rr_cpu)); -} - -/* Kick all RR vCPUs */ -static void qemu_cpu_kick_rr_cpus(void) -{ - CPUState *cpu; - - CPU_FOREACH(cpu) { - cpu_exit(cpu); - }; -} - -static void kick_tcg_thread(void *opaque) -{ - timer_mod(tcg_kick_vcpu_timer, qemu_tcg_next_kick()); - qemu_cpu_kick_rr_next_cpu(); -} - -static void start_tcg_kick_timer(void) -{ - assert(!mttcg_enabled); - if (!tcg_kick_vcpu_timer && CPU_NEXT(first_cpu)) { - tcg_kick_vcpu_timer =3D timer_new_ns(QEMU_CLOCK_VIRTUAL, - kick_tcg_thread, NULL); - } - if (tcg_kick_vcpu_timer && !timer_pending(tcg_kick_vcpu_timer)) { - timer_mod(tcg_kick_vcpu_timer, qemu_tcg_next_kick()); - } -} - -static void stop_tcg_kick_timer(void) -{ - assert(!mttcg_enabled); - if (tcg_kick_vcpu_timer && timer_pending(tcg_kick_vcpu_timer)) { - timer_del(tcg_kick_vcpu_timer); - } -} - /***********************************************************/ void hw_error(const char *fmt, ...) { @@ -204,16 +124,21 @@ void hw_error(const char *fmt, ...) abort(); } =20 +/* + * every accelerator is supposed to register this. + * Cannot be done cleanly as a machine state or accel class method, + * since TCG is not a normal accelerator yet, + * with USER mode being special-cased and other complications. + */ +static CpusAccelInterface accel_int; + void cpu_synchronize_all_states(void) { CPUState *cpu; =20 CPU_FOREACH(cpu) { - cpu_synchronize_state(cpu); - /* TODO: move to cpu_synchronize_state() */ - if (hvf_enabled()) { - hvf_cpu_synchronize_state(cpu); - } + assert(accel_int.cpu_synchronize_state !=3D NULL); + accel_int.cpu_synchronize_state(cpu); } } =20 @@ -222,11 +147,8 @@ void cpu_synchronize_all_post_reset(void) CPUState *cpu; =20 CPU_FOREACH(cpu) { - cpu_synchronize_post_reset(cpu); - /* TODO: move to cpu_synchronize_post_reset() */ - if (hvf_enabled()) { - hvf_cpu_synchronize_post_reset(cpu); - } + assert(accel_int.cpu_synchronize_post_reset !=3D NULL); + accel_int.cpu_synchronize_post_reset(cpu); } } =20 @@ -235,11 +157,8 @@ void cpu_synchronize_all_post_init(void) CPUState *cpu; =20 CPU_FOREACH(cpu) { - cpu_synchronize_post_init(cpu); - /* TODO: move to cpu_synchronize_post_init() */ - if (hvf_enabled()) { - hvf_cpu_synchronize_post_init(cpu); - } + assert(accel_int.cpu_synchronize_post_init !=3D NULL); + accel_int.cpu_synchronize_post_init(cpu); } } =20 @@ -248,7 +167,8 @@ void cpu_synchronize_all_pre_loadvm(void) CPUState *cpu; =20 CPU_FOREACH(cpu) { - cpu_synchronize_pre_loadvm(cpu); + assert(accel_int.cpu_synchronize_pre_loadvm !=3D NULL); + accel_int.cpu_synchronize_pre_loadvm(cpu); } } =20 @@ -280,7 +200,7 @@ int vm_shutdown(void) return do_vm_stop(RUN_STATE_SHUTDOWN, false); } =20 -static bool cpu_can_run(CPUState *cpu) +bool cpu_can_run(CPUState *cpu) { if (cpu->stop) { return false; @@ -291,7 +211,7 @@ static bool cpu_can_run(CPUState *cpu) return true; } =20 -static void cpu_handle_guest_debug(CPUState *cpu) +void cpu_handle_guest_debug(CPUState *cpu) { gdb_set_stop_cpu(cpu); qemu_system_debug_request(); @@ -374,18 +294,6 @@ void run_on_cpu(CPUState *cpu, run_on_cpu_func func, r= un_on_cpu_data data) do_run_on_cpu(cpu, func, data, &qemu_global_mutex); } =20 -static void qemu_kvm_destroy_vcpu(CPUState *cpu) -{ - if (kvm_destroy_vcpu(cpu) < 0) { - error_report("kvm_destroy_vcpu failed"); - exit(EXIT_FAILURE); - } -} - -static void qemu_tcg_destroy_vcpu(CPUState *cpu) -{ -} - static void qemu_cpu_stop(CPUState *cpu, bool exit) { g_assert(qemu_cpu_is_self(cpu)); @@ -397,7 +305,7 @@ static void qemu_cpu_stop(CPUState *cpu, bool exit) qemu_cond_broadcast(&qemu_pause_cond); } =20 -static void qemu_wait_io_event_common(CPUState *cpu) +void qemu_wait_io_event_common(CPUState *cpu) { atomic_mb_set(&cpu->thread_kicked, false); if (cpu->stop) { @@ -406,23 +314,7 @@ static void qemu_wait_io_event_common(CPUState *cpu) process_queued_cpu_work(cpu); } =20 -static void qemu_tcg_rr_wait_io_event(void) -{ - CPUState *cpu; - - while (all_cpu_threads_idle()) { - stop_tcg_kick_timer(); - qemu_cond_wait(first_cpu->halt_cond, &qemu_global_mutex); - } - - start_tcg_kick_timer(); - - CPU_FOREACH(cpu) { - qemu_wait_io_event_common(cpu); - } -} - -static void qemu_wait_io_event(CPUState *cpu) +void qemu_wait_io_event(CPUState *cpu) { bool slept =3D false; =20 @@ -438,7 +330,8 @@ static void qemu_wait_io_event(CPUState *cpu) } =20 #ifdef _WIN32 - /* Eat dummy APC queued by qemu_cpu_kick_thread. */ + /* Eat dummy APC queued by hax_kick_vcpu_thread */ + /* NB!!! Should not this be if (hax_enabled)? Is this wrong for whpx? = */ if (!tcg_enabled()) { SleepEx(0, TRUE); } @@ -446,540 +339,7 @@ static void qemu_wait_io_event(CPUState *cpu) qemu_wait_io_event_common(cpu); } =20 -static void *qemu_kvm_cpu_thread_fn(void *arg) -{ - CPUState *cpu =3D arg; - int r; - - rcu_register_thread(); - - qemu_mutex_lock_iothread(); - qemu_thread_get_self(cpu->thread); - cpu->thread_id =3D qemu_get_thread_id(); - cpu->can_do_io =3D 1; - current_cpu =3D cpu; - - r =3D kvm_init_vcpu(cpu); - if (r < 0) { - error_report("kvm_init_vcpu failed: %s", strerror(-r)); - exit(1); - } - - kvm_init_cpu_signals(cpu); - - /* signal CPU creation */ - cpu->created =3D true; - qemu_cond_signal(&qemu_cpu_cond); - qemu_guest_random_seed_thread_part2(cpu->random_seed); - - do { - if (cpu_can_run(cpu)) { - r =3D kvm_cpu_exec(cpu); - if (r =3D=3D EXCP_DEBUG) { - cpu_handle_guest_debug(cpu); - } - } - qemu_wait_io_event(cpu); - } while (!cpu->unplug || cpu_can_run(cpu)); - - qemu_kvm_destroy_vcpu(cpu); - cpu->created =3D false; - qemu_cond_signal(&qemu_cpu_cond); - qemu_mutex_unlock_iothread(); - rcu_unregister_thread(); - return NULL; -} - -static void *qemu_dummy_cpu_thread_fn(void *arg) -{ -#ifdef _WIN32 - error_report("qtest is not supported under Windows"); - exit(1); -#else - CPUState *cpu =3D arg; - sigset_t waitset; - int r; - - rcu_register_thread(); - - qemu_mutex_lock_iothread(); - qemu_thread_get_self(cpu->thread); - cpu->thread_id =3D qemu_get_thread_id(); - cpu->can_do_io =3D 1; - current_cpu =3D cpu; - - sigemptyset(&waitset); - sigaddset(&waitset, SIG_IPI); - - /* signal CPU creation */ - cpu->created =3D true; - qemu_cond_signal(&qemu_cpu_cond); - qemu_guest_random_seed_thread_part2(cpu->random_seed); - - do { - qemu_mutex_unlock_iothread(); - do { - int sig; - r =3D sigwait(&waitset, &sig); - } while (r =3D=3D -1 && (errno =3D=3D EAGAIN || errno =3D=3D EINTR= )); - if (r =3D=3D -1) { - perror("sigwait"); - exit(1); - } - qemu_mutex_lock_iothread(); - qemu_wait_io_event(cpu); - } while (!cpu->unplug); - - qemu_mutex_unlock_iothread(); - rcu_unregister_thread(); - return NULL; -#endif -} - -static int64_t tcg_get_icount_limit(void) -{ - int64_t deadline; - - if (replay_mode !=3D REPLAY_MODE_PLAY) { - /* - * Include all the timers, because they may need an attention. - * Too long CPU execution may create unnecessary delay in UI. - */ - deadline =3D qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL, - QEMU_TIMER_ATTR_ALL); - /* Check realtime timers, because they help with input processing = */ - deadline =3D qemu_soonest_timeout(deadline, - qemu_clock_deadline_ns_all(QEMU_CLOCK_REALTIME, - QEMU_TIMER_ATTR_ALL)); - - /* Maintain prior (possibly buggy) behaviour where if no deadline - * was set (as there is no QEMU_CLOCK_VIRTUAL timer) or it is more= than - * INT32_MAX nanoseconds ahead, we still use INT32_MAX - * nanoseconds. - */ - if ((deadline < 0) || (deadline > INT32_MAX)) { - deadline =3D INT32_MAX; - } - - return icount_round(deadline); - } else { - return replay_get_instructions(); - } -} - -static void handle_icount_deadline(void) -{ - assert(qemu_in_vcpu_thread()); - if (icount_enabled()) { - int64_t deadline =3D qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL, - QEMU_TIMER_ATTR_ALL); - - if (deadline =3D=3D 0) { - /* Wake up other AioContexts. */ - qemu_clock_notify(QEMU_CLOCK_VIRTUAL); - qemu_clock_run_timers(QEMU_CLOCK_VIRTUAL); - } - } -} - -static void prepare_icount_for_run(CPUState *cpu) -{ - if (icount_enabled()) { - int insns_left; - - /* These should always be cleared by process_icount_data after - * each vCPU execution. However u16.high can be raised - * asynchronously by cpu_exit/cpu_interrupt/tcg_handle_interrupt - */ - g_assert(cpu_neg(cpu)->icount_decr.u16.low =3D=3D 0); - g_assert(cpu->icount_extra =3D=3D 0); - - cpu->icount_budget =3D tcg_get_icount_limit(); - insns_left =3D MIN(0xffff, cpu->icount_budget); - cpu_neg(cpu)->icount_decr.u16.low =3D insns_left; - cpu->icount_extra =3D cpu->icount_budget - insns_left; - - replay_mutex_lock(); - } -} - -static void process_icount_data(CPUState *cpu) -{ - if (icount_enabled()) { - /* Account for executed instructions */ - icount_update(cpu); - - /* Reset the counters */ - cpu_neg(cpu)->icount_decr.u16.low =3D 0; - cpu->icount_extra =3D 0; - cpu->icount_budget =3D 0; - - replay_account_executed_instructions(); - - replay_mutex_unlock(); - } -} - - -static int tcg_cpu_exec(CPUState *cpu) -{ - int ret; -#ifdef CONFIG_PROFILER - int64_t ti; -#endif - - assert(tcg_enabled()); -#ifdef CONFIG_PROFILER - ti =3D profile_getclock(); -#endif - cpu_exec_start(cpu); - ret =3D cpu_exec(cpu); - cpu_exec_end(cpu); -#ifdef CONFIG_PROFILER - atomic_set(&tcg_ctx->prof.cpu_exec_time, - tcg_ctx->prof.cpu_exec_time + profile_getclock() - ti); -#endif - return ret; -} - -/* Destroy any remaining vCPUs which have been unplugged and have - * finished running - */ -static void deal_with_unplugged_cpus(void) -{ - CPUState *cpu; - - CPU_FOREACH(cpu) { - if (cpu->unplug && !cpu_can_run(cpu)) { - qemu_tcg_destroy_vcpu(cpu); - cpu->created =3D false; - qemu_cond_signal(&qemu_cpu_cond); - break; - } - } -} - -/* Single-threaded TCG - * - * In the single-threaded case each vCPU is simulated in turn. If - * there is more than a single vCPU we create a simple timer to kick - * the vCPU and ensure we don't get stuck in a tight loop in one vCPU. - * This is done explicitly rather than relying on side-effects - * elsewhere. - */ - -static void *qemu_tcg_rr_cpu_thread_fn(void *arg) -{ - CPUState *cpu =3D arg; - - assert(tcg_enabled()); - rcu_register_thread(); - tcg_register_thread(); - - qemu_mutex_lock_iothread(); - qemu_thread_get_self(cpu->thread); - - cpu->thread_id =3D qemu_get_thread_id(); - cpu->created =3D true; - cpu->can_do_io =3D 1; - qemu_cond_signal(&qemu_cpu_cond); - qemu_guest_random_seed_thread_part2(cpu->random_seed); - - /* wait for initial kick-off after machine start */ - while (first_cpu->stopped) { - qemu_cond_wait(first_cpu->halt_cond, &qemu_global_mutex); - - /* process any pending work */ - CPU_FOREACH(cpu) { - current_cpu =3D cpu; - qemu_wait_io_event_common(cpu); - } - } - - start_tcg_kick_timer(); - - cpu =3D first_cpu; - - /* process any pending work */ - cpu->exit_request =3D 1; - - while (1) { - qemu_mutex_unlock_iothread(); - replay_mutex_lock(); - qemu_mutex_lock_iothread(); - /* Account partial waits to QEMU_CLOCK_VIRTUAL. */ - icount_account_warp_timer(); - - /* Run the timers here. This is much more efficient than - * waking up the I/O thread and waiting for completion. - */ - handle_icount_deadline(); - - replay_mutex_unlock(); - - if (!cpu) { - cpu =3D first_cpu; - } - - while (cpu && !cpu->queued_work_first && !cpu->exit_request) { - - atomic_mb_set(&tcg_current_rr_cpu, cpu); - current_cpu =3D cpu; - - qemu_clock_enable(QEMU_CLOCK_VIRTUAL, - (cpu->singlestep_enabled & SSTEP_NOTIMER) = =3D=3D 0); - - if (cpu_can_run(cpu)) { - int r; - - qemu_mutex_unlock_iothread(); - prepare_icount_for_run(cpu); - - r =3D tcg_cpu_exec(cpu); - - process_icount_data(cpu); - qemu_mutex_lock_iothread(); - - if (r =3D=3D EXCP_DEBUG) { - cpu_handle_guest_debug(cpu); - break; - } else if (r =3D=3D EXCP_ATOMIC) { - qemu_mutex_unlock_iothread(); - cpu_exec_step_atomic(cpu); - qemu_mutex_lock_iothread(); - break; - } - } else if (cpu->stop) { - if (cpu->unplug) { - cpu =3D CPU_NEXT(cpu); - } - break; - } - - cpu =3D CPU_NEXT(cpu); - } /* while (cpu && !cpu->exit_request).. */ - - /* Does not need atomic_mb_set because a spurious wakeup is okay. = */ - atomic_set(&tcg_current_rr_cpu, NULL); - - if (cpu && cpu->exit_request) { - atomic_mb_set(&cpu->exit_request, 0); - } - - if (icount_enabled() && all_cpu_threads_idle()) { - /* - * When all cpus are sleeping (e.g in WFI), to avoid a deadlock - * in the main_loop, wake it up in order to start the warp tim= er. - */ - qemu_notify_event(); - } - - qemu_tcg_rr_wait_io_event(); - deal_with_unplugged_cpus(); - } - - rcu_unregister_thread(); - return NULL; -} - -static void *qemu_hax_cpu_thread_fn(void *arg) -{ - CPUState *cpu =3D arg; - int r; - - rcu_register_thread(); - qemu_mutex_lock_iothread(); - qemu_thread_get_self(cpu->thread); - - cpu->thread_id =3D qemu_get_thread_id(); - cpu->created =3D true; - current_cpu =3D cpu; - - hax_init_vcpu(cpu); - qemu_cond_signal(&qemu_cpu_cond); - qemu_guest_random_seed_thread_part2(cpu->random_seed); - - do { - if (cpu_can_run(cpu)) { - r =3D hax_smp_cpu_exec(cpu); - if (r =3D=3D EXCP_DEBUG) { - cpu_handle_guest_debug(cpu); - } - } - - qemu_wait_io_event(cpu); - } while (!cpu->unplug || cpu_can_run(cpu)); - rcu_unregister_thread(); - return NULL; -} - -/* The HVF-specific vCPU thread function. This one should only run when th= e host - * CPU supports the VMX "unrestricted guest" feature. */ -static void *qemu_hvf_cpu_thread_fn(void *arg) -{ - CPUState *cpu =3D arg; - - int r; - - assert(hvf_enabled()); - - rcu_register_thread(); - - qemu_mutex_lock_iothread(); - qemu_thread_get_self(cpu->thread); - - cpu->thread_id =3D qemu_get_thread_id(); - cpu->can_do_io =3D 1; - current_cpu =3D cpu; - - hvf_init_vcpu(cpu); - - /* signal CPU creation */ - cpu->created =3D true; - qemu_cond_signal(&qemu_cpu_cond); - qemu_guest_random_seed_thread_part2(cpu->random_seed); - - do { - if (cpu_can_run(cpu)) { - r =3D hvf_vcpu_exec(cpu); - if (r =3D=3D EXCP_DEBUG) { - cpu_handle_guest_debug(cpu); - } - } - qemu_wait_io_event(cpu); - } while (!cpu->unplug || cpu_can_run(cpu)); - - hvf_vcpu_destroy(cpu); - cpu->created =3D false; - qemu_cond_signal(&qemu_cpu_cond); - qemu_mutex_unlock_iothread(); - rcu_unregister_thread(); - return NULL; -} - -static void *qemu_whpx_cpu_thread_fn(void *arg) -{ - CPUState *cpu =3D arg; - int r; - - rcu_register_thread(); - - qemu_mutex_lock_iothread(); - qemu_thread_get_self(cpu->thread); - cpu->thread_id =3D qemu_get_thread_id(); - current_cpu =3D cpu; - - r =3D whpx_init_vcpu(cpu); - if (r < 0) { - fprintf(stderr, "whpx_init_vcpu failed: %s\n", strerror(-r)); - exit(1); - } - - /* signal CPU creation */ - cpu->created =3D true; - qemu_cond_signal(&qemu_cpu_cond); - qemu_guest_random_seed_thread_part2(cpu->random_seed); - - do { - if (cpu_can_run(cpu)) { - r =3D whpx_vcpu_exec(cpu); - if (r =3D=3D EXCP_DEBUG) { - cpu_handle_guest_debug(cpu); - } - } - while (cpu_thread_is_idle(cpu)) { - qemu_cond_wait(cpu->halt_cond, &qemu_global_mutex); - } - qemu_wait_io_event_common(cpu); - } while (!cpu->unplug || cpu_can_run(cpu)); - - whpx_destroy_vcpu(cpu); - cpu->created =3D false; - qemu_cond_signal(&qemu_cpu_cond); - qemu_mutex_unlock_iothread(); - rcu_unregister_thread(); - return NULL; -} - -#ifdef _WIN32 -static void CALLBACK dummy_apc_func(ULONG_PTR unused) -{ -} -#endif - -/* Multi-threaded TCG - * - * In the multi-threaded case each vCPU has its own thread. The TLS - * variable current_cpu can be used deep in the code to find the - * current CPUState for a given thread. - */ - -static void *qemu_tcg_cpu_thread_fn(void *arg) -{ - CPUState *cpu =3D arg; - - assert(tcg_enabled()); - g_assert(!icount_enabled()); - - rcu_register_thread(); - tcg_register_thread(); - - qemu_mutex_lock_iothread(); - qemu_thread_get_self(cpu->thread); - - cpu->thread_id =3D qemu_get_thread_id(); - cpu->created =3D true; - cpu->can_do_io =3D 1; - current_cpu =3D cpu; - qemu_cond_signal(&qemu_cpu_cond); - qemu_guest_random_seed_thread_part2(cpu->random_seed); - - /* process any pending work */ - cpu->exit_request =3D 1; - - do { - if (cpu_can_run(cpu)) { - int r; - qemu_mutex_unlock_iothread(); - r =3D tcg_cpu_exec(cpu); - qemu_mutex_lock_iothread(); - switch (r) { - case EXCP_DEBUG: - cpu_handle_guest_debug(cpu); - break; - case EXCP_HALTED: - /* during start-up the vCPU is reset and the thread is - * kicked several times. If we don't ensure we go back - * to sleep in the halted state we won't cleanly - * start-up when the vCPU is enabled. - * - * cpu->halted should ensure we sleep in wait_io_event - */ - g_assert(cpu->halted); - break; - case EXCP_ATOMIC: - qemu_mutex_unlock_iothread(); - cpu_exec_step_atomic(cpu); - qemu_mutex_lock_iothread(); - default: - /* Ignore everything else? */ - break; - } - } - - atomic_mb_set(&cpu->exit_request, 0); - qemu_wait_io_event(cpu); - } while (!cpu->unplug || cpu_can_run(cpu)); - - qemu_tcg_destroy_vcpu(cpu); - cpu->created =3D false; - qemu_cond_signal(&qemu_cpu_cond); - qemu_mutex_unlock_iothread(); - rcu_unregister_thread(); - return NULL; -} - -static void qemu_cpu_kick_thread(CPUState *cpu) +void cpus_kick_thread(CPUState *cpu) { #ifndef _WIN32 int err; @@ -993,44 +353,20 @@ static void qemu_cpu_kick_thread(CPUState *cpu) fprintf(stderr, "qemu:%s: %s", __func__, strerror(err)); exit(1); } -#else /* _WIN32 */ - if (!qemu_cpu_is_self(cpu)) { - if (whpx_enabled()) { - whpx_vcpu_kick(cpu); - } else if (!QueueUserAPC(dummy_apc_func, cpu->hThread, 0)) { - fprintf(stderr, "%s: QueueUserAPC failed with error %lu\n", - __func__, GetLastError()); - exit(1); - } - } #endif } =20 void qemu_cpu_kick(CPUState *cpu) { qemu_cond_broadcast(cpu->halt_cond); - if (tcg_enabled()) { - if (qemu_tcg_mttcg_enabled()) { - cpu_exit(cpu); - } else { - qemu_cpu_kick_rr_cpus(); - } - } else { - if (hax_enabled()) { - /* - * FIXME: race condition with the exit_request check in - * hax_vcpu_hax_exec - */ - cpu->exit_request =3D 1; - } - qemu_cpu_kick_thread(cpu); - } + assert(accel_int.kick_vcpu_thread !=3D NULL); + accel_int.kick_vcpu_thread(cpu); } =20 void qemu_cpu_kick_self(void) { assert(current_cpu); - qemu_cpu_kick_thread(current_cpu); + cpus_kick_thread(current_cpu); } =20 bool qemu_cpu_is_self(CPUState *cpu) @@ -1080,6 +416,21 @@ void qemu_cond_timedwait_iothread(QemuCond *cond, int= ms) qemu_cond_timedwait(cond, &qemu_global_mutex, ms); } =20 +/* signal CPU creation */ +void cpu_thread_signal_created(CPUState *cpu) +{ + cpu->created =3D true; + qemu_cond_signal(&qemu_cpu_cond); +} + +/* signal CPU destruction */ +void cpu_thread_signal_destroyed(CPUState *cpu) +{ + cpu->created =3D false; + qemu_cond_signal(&qemu_cpu_cond); +} + + static bool all_vcpus_paused(void) { CPUState *cpu; @@ -1155,140 +506,18 @@ void cpu_remove_sync(CPUState *cpu) qemu_mutex_lock_iothread(); } =20 -/* For temporary buffers for forming a name */ -#define VCPU_THREAD_NAME_SIZE 16 - -static void qemu_tcg_init_vcpu(CPUState *cpu) -{ - char thread_name[VCPU_THREAD_NAME_SIZE]; - static QemuCond *single_tcg_halt_cond; - static QemuThread *single_tcg_cpu_thread; - static int tcg_region_inited; - - assert(tcg_enabled()); - /* - * Initialize TCG regions--once. Now is a good time, because: - * (1) TCG's init context, prologue and target globals have been set u= p. - * (2) qemu_tcg_mttcg_enabled() works now (TCG init code runs before t= he - * -accel flag is processed, so the check doesn't work then). - */ - if (!tcg_region_inited) { - tcg_region_inited =3D 1; - tcg_region_init(); - } - - if (qemu_tcg_mttcg_enabled() || !single_tcg_cpu_thread) { - cpu->thread =3D g_malloc0(sizeof(QemuThread)); - cpu->halt_cond =3D g_malloc0(sizeof(QemuCond)); - qemu_cond_init(cpu->halt_cond); - - if (qemu_tcg_mttcg_enabled()) { - /* create a thread per vCPU with TCG (MTTCG) */ - parallel_cpus =3D true; - snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/TCG", - cpu->cpu_index); - - qemu_thread_create(cpu->thread, thread_name, qemu_tcg_cpu_thre= ad_fn, - cpu, QEMU_THREAD_JOINABLE); - - } else { - /* share a single thread for all cpus with TCG */ - snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "ALL CPUs/TCG"); - qemu_thread_create(cpu->thread, thread_name, - qemu_tcg_rr_cpu_thread_fn, - cpu, QEMU_THREAD_JOINABLE); - - single_tcg_halt_cond =3D cpu->halt_cond; - single_tcg_cpu_thread =3D cpu->thread; - } -#ifdef _WIN32 - cpu->hThread =3D qemu_thread_get_handle(cpu->thread); -#endif - } else { - /* For non-MTTCG cases we share the thread */ - cpu->thread =3D single_tcg_cpu_thread; - cpu->halt_cond =3D single_tcg_halt_cond; - cpu->thread_id =3D first_cpu->thread_id; - cpu->can_do_io =3D 1; - cpu->created =3D true; - } -} - -static void qemu_hax_start_vcpu(CPUState *cpu) -{ - char thread_name[VCPU_THREAD_NAME_SIZE]; - - cpu->thread =3D g_malloc0(sizeof(QemuThread)); - cpu->halt_cond =3D g_malloc0(sizeof(QemuCond)); - qemu_cond_init(cpu->halt_cond); - - snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/HAX", - cpu->cpu_index); - qemu_thread_create(cpu->thread, thread_name, qemu_hax_cpu_thread_fn, - cpu, QEMU_THREAD_JOINABLE); -#ifdef _WIN32 - cpu->hThread =3D qemu_thread_get_handle(cpu->thread); -#endif -} - -static void qemu_kvm_start_vcpu(CPUState *cpu) +void cpus_register_accel_interface(CpusAccelInterface *i) { - char thread_name[VCPU_THREAD_NAME_SIZE]; - - cpu->thread =3D g_malloc0(sizeof(QemuThread)); - cpu->halt_cond =3D g_malloc0(sizeof(QemuCond)); - qemu_cond_init(cpu->halt_cond); - snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/KVM", - cpu->cpu_index); - qemu_thread_create(cpu->thread, thread_name, qemu_kvm_cpu_thread_fn, - cpu, QEMU_THREAD_JOINABLE); -} - -static void qemu_hvf_start_vcpu(CPUState *cpu) -{ - char thread_name[VCPU_THREAD_NAME_SIZE]; - - /* HVF currently does not support TCG, and only runs in - * unrestricted-guest mode. */ - assert(hvf_enabled()); - - cpu->thread =3D g_malloc0(sizeof(QemuThread)); - cpu->halt_cond =3D g_malloc0(sizeof(QemuCond)); - qemu_cond_init(cpu->halt_cond); + assert(i !=3D NULL); + assert(i->create_vcpu_thread !=3D NULL); + assert(i->kick_vcpu_thread !=3D NULL); =20 - snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/HVF", - cpu->cpu_index); - qemu_thread_create(cpu->thread, thread_name, qemu_hvf_cpu_thread_fn, - cpu, QEMU_THREAD_JOINABLE); -} - -static void qemu_whpx_start_vcpu(CPUState *cpu) -{ - char thread_name[VCPU_THREAD_NAME_SIZE]; - - cpu->thread =3D g_malloc0(sizeof(QemuThread)); - cpu->halt_cond =3D g_malloc0(sizeof(QemuCond)); - qemu_cond_init(cpu->halt_cond); - snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/WHPX", - cpu->cpu_index); - qemu_thread_create(cpu->thread, thread_name, qemu_whpx_cpu_thread_fn, - cpu, QEMU_THREAD_JOINABLE); -#ifdef _WIN32 - cpu->hThread =3D qemu_thread_get_handle(cpu->thread); -#endif -} + assert(i->cpu_synchronize_post_reset !=3D NULL); + assert(i->cpu_synchronize_post_init !=3D NULL); + assert(i->cpu_synchronize_state !=3D NULL); + assert(i->cpu_synchronize_pre_loadvm !=3D NULL); =20 -static void qemu_dummy_start_vcpu(CPUState *cpu) -{ - char thread_name[VCPU_THREAD_NAME_SIZE]; - - cpu->thread =3D g_malloc0(sizeof(QemuThread)); - cpu->halt_cond =3D g_malloc0(sizeof(QemuCond)); - qemu_cond_init(cpu->halt_cond); - snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/DUMMY", - cpu->cpu_index); - qemu_thread_create(cpu->thread, thread_name, qemu_dummy_cpu_thread_fn,= cpu, - QEMU_THREAD_JOINABLE); + accel_int =3D *i; } =20 void qemu_init_vcpu(CPUState *cpu) @@ -1308,19 +537,8 @@ void qemu_init_vcpu(CPUState *cpu) cpu_address_space_init(cpu, 0, "cpu-memory", cpu->memory); } =20 - if (kvm_enabled()) { - qemu_kvm_start_vcpu(cpu); - } else if (hax_enabled()) { - qemu_hax_start_vcpu(cpu); - } else if (hvf_enabled()) { - qemu_hvf_start_vcpu(cpu); - } else if (tcg_enabled()) { - qemu_tcg_init_vcpu(cpu); - } else if (whpx_enabled()) { - qemu_whpx_start_vcpu(cpu); - } else { - qemu_dummy_start_vcpu(cpu); - } + assert(accel_int.create_vcpu_thread !=3D NULL); + accel_int.create_vcpu_thread(cpu); =20 while (!cpu->created) { qemu_cond_wait(&qemu_cpu_cond, &qemu_global_mutex); @@ -1498,3 +716,26 @@ void qmp_inject_nmi(Error **errp) nmi_monitor_handle(monitor_get_cpu_index(), errp); } =20 +void cpu_synchronize_state(CPUState *cpu) +{ + assert(accel_int.cpu_synchronize_state !=3D NULL); + accel_int.cpu_synchronize_state(cpu); +} + +void cpu_synchronize_post_reset(CPUState *cpu) +{ + assert(accel_int.cpu_synchronize_post_reset !=3D NULL); + accel_int.cpu_synchronize_post_reset(cpu); +} + +void cpu_synchronize_post_init(CPUState *cpu) +{ + assert(accel_int.cpu_synchronize_post_init !=3D NULL); + accel_int.cpu_synchronize_post_init(cpu); +} + +void cpu_synchronize_pre_loadvm(CPUState *cpu) +{ + assert(accel_int.cpu_synchronize_pre_loadvm !=3D NULL); + accel_int.cpu_synchronize_pre_loadvm(cpu); +} diff --git a/hw/core/cpu.c b/hw/core/cpu.c index 5284d384fb..b601654a10 100644 --- a/hw/core/cpu.c +++ b/hw/core/cpu.c @@ -33,6 +33,7 @@ #include "hw/qdev-properties.h" #include "trace-root.h" #include "qemu/plugin.h" +#include "sysemu/hw_accel.h" =20 CPUInterruptHandler cpu_interrupt_handler; =20 diff --git a/include/sysemu/cpus.h b/include/sysemu/cpus.h index 149de000a0..b9d76ebda8 100644 --- a/include/sysemu/cpus.h +++ b/include/sysemu/cpus.h @@ -4,7 +4,51 @@ #include "qemu/timer.h" =20 /* cpus.c */ + +/* CPU execution threads */ + +typedef struct CpusAccelInterface { + void (*create_vcpu_thread)(CPUState *cpu); + void (*kick_vcpu_thread)(CPUState *cpu); + + void (*cpu_synchronize_post_reset)(CPUState *cpu); + void (*cpu_synchronize_post_init)(CPUState *cpu); + void (*cpu_synchronize_state)(CPUState *cpu); + void (*cpu_synchronize_pre_loadvm)(CPUState *cpu); +} CpusAccelInterface; + +/* register accel-specific interface */ +void cpus_register_accel_interface(CpusAccelInterface *i); + +/* + * these are the registrable vcpu start functions for all accelerators. + * They are arguments to qemu_register_start_vcpu(); + */ +void qemu_dummy_start_vcpu(CPUState *cpu); +void qemu_tcg_init_vcpu(CPUState *cpu); +void qemu_kvm_start_vcpu(CPUState *cpu); +void qemu_hax_start_vcpu(CPUState *cpu); +void qemu_hvf_start_vcpu(CPUState *cpu); +void qemu_whpx_start_vcpu(CPUState *cpu); +/* end of vcpu start functions for accelerators */ + +/* interface available for cpus accelerator threads */ + +/* For temporary buffers for forming a name */ +#define VCPU_THREAD_NAME_SIZE 16 + +void cpus_kick_thread(CPUState *cpu); +bool cpu_thread_is_idle(CPUState *cpu); bool all_cpu_threads_idle(void); +bool cpu_can_run(CPUState *cpu); +void qemu_wait_io_event_common(CPUState *cpu); +void qemu_wait_io_event(CPUState *cpu); +void cpu_thread_signal_created(CPUState *cpu); +void cpu_thread_signal_destroyed(CPUState *cpu); +void cpu_handle_guest_debug(CPUState *cpu); + +/* end interface for cpus accelerator threads */ + bool qemu_in_vcpu_thread(void); void qemu_init_cpu_loop(void); void resume_all_vcpus(void); diff --git a/include/sysemu/hvf.h b/include/sysemu/hvf.h index d211e808e9..cdd4172b24 100644 --- a/include/sysemu/hvf.h +++ b/include/sysemu/hvf.h @@ -86,7 +86,6 @@ int hvf_smp_cpu_exec(CPUState *); void hvf_cpu_synchronize_state(CPUState *); void hvf_cpu_synchronize_post_reset(CPUState *); void hvf_cpu_synchronize_post_init(CPUState *); -void _hvf_cpu_synchronize_post_init(CPUState *, run_on_cpu_data); =20 void hvf_vcpu_destroy(CPUState *); void hvf_raise_event(CPUState *); diff --git a/include/sysemu/hw_accel.h b/include/sysemu/hw_accel.h index 0ec2372477..336740e10a 100644 --- a/include/sysemu/hw_accel.h +++ b/include/sysemu/hw_accel.h @@ -1,5 +1,5 @@ /* - * QEMU Hardware accelertors support + * QEMU Hardware accelerators support * * Copyright 2016 Google, Inc. * @@ -16,56 +16,9 @@ #include "sysemu/kvm.h" #include "sysemu/whpx.h" =20 -static inline void cpu_synchronize_state(CPUState *cpu) -{ - if (kvm_enabled()) { - kvm_cpu_synchronize_state(cpu); - } - if (hax_enabled()) { - hax_cpu_synchronize_state(cpu); - } - if (whpx_enabled()) { - whpx_cpu_synchronize_state(cpu); - } -} - -static inline void cpu_synchronize_post_reset(CPUState *cpu) -{ - if (kvm_enabled()) { - kvm_cpu_synchronize_post_reset(cpu); - } - if (hax_enabled()) { - hax_cpu_synchronize_post_reset(cpu); - } - if (whpx_enabled()) { - whpx_cpu_synchronize_post_reset(cpu); - } -} - -static inline void cpu_synchronize_post_init(CPUState *cpu) -{ - if (kvm_enabled()) { - kvm_cpu_synchronize_post_init(cpu); - } - if (hax_enabled()) { - hax_cpu_synchronize_post_init(cpu); - } - if (whpx_enabled()) { - whpx_cpu_synchronize_post_init(cpu); - } -} - -static inline void cpu_synchronize_pre_loadvm(CPUState *cpu) -{ - if (kvm_enabled()) { - kvm_cpu_synchronize_pre_loadvm(cpu); - } - if (hax_enabled()) { - hax_cpu_synchronize_pre_loadvm(cpu); - } - if (whpx_enabled()) { - whpx_cpu_synchronize_pre_loadvm(cpu); - } -} +void cpu_synchronize_state(CPUState *cpu); +void cpu_synchronize_post_reset(CPUState *cpu); +void cpu_synchronize_post_init(CPUState *cpu); +void cpu_synchronize_pre_loadvm(CPUState *cpu); =20 #endif /* QEMU_HW_ACCEL_H */ diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h index 3b2250471c..3d84c940db 100644 --- a/include/sysemu/kvm.h +++ b/include/sysemu/kvm.h @@ -218,7 +218,7 @@ int kvm_has_intx_set_mask(void); =20 int kvm_init_vcpu(CPUState *cpu); int kvm_cpu_exec(CPUState *cpu); -int kvm_destroy_vcpu(CPUState *cpu); +void kvm_destroy_vcpu(CPUState *cpu); =20 /** * kvm_arm_supports_user_irq diff --git a/stubs/Makefile.objs b/stubs/Makefile.objs index 45be5dc0ed..3bbd72b7cd 100644 --- a/stubs/Makefile.objs +++ b/stubs/Makefile.objs @@ -43,4 +43,5 @@ stub-obj-y +=3D pci-host-piix.o stub-obj-y +=3D ram-block.o stub-obj-y +=3D ramfb.o stub-obj-y +=3D fw_cfg.o +stub-obj-y +=3D cpu-synchronize-state.o stub-obj-$(CONFIG_SOFTMMU) +=3D semihost.o diff --git a/stubs/cpu-synchronize-state.c b/stubs/cpu-synchronize-state.c new file mode 100644 index 0000000000..3112fe439d --- /dev/null +++ b/stubs/cpu-synchronize-state.c @@ -0,0 +1,15 @@ +#include "qemu/osdep.h" +#include "sysemu/hw_accel.h" + +void cpu_synchronize_state(CPUState *cpu) +{ +} +void cpu_synchronize_post_reset(CPUState *cpu) +{ +} +void cpu_synchronize_post_init(CPUState *cpu) +{ +} +void cpu_synchronize_pre_loadvm(CPUState *cpu) +{ +} diff --git a/target/i386/Makefile.objs b/target/i386/Makefile.objs index 48e0c28434..a847064e68 100644 --- a/target/i386/Makefile.objs +++ b/target/i386/Makefile.objs @@ -9,14 +9,15 @@ obj-y +=3D machine.o arch_memory_mapping.o arch_dump.o mo= nitor.o obj-$(CONFIG_KVM) +=3D kvm.o obj-$(CONFIG_HYPERV) +=3D hyperv.o obj-$(call lnot,$(CONFIG_HYPERV)) +=3D hyperv-stub.o +obj-$(CONFIG_HAX) +=3D hax-all.o hax-mem.o hax-cpus-interface.o ifeq ($(CONFIG_WIN32),y) -obj-$(CONFIG_HAX) +=3D hax-all.o hax-mem.o hax-windows.o +obj-$(CONFIG_HAX) +=3D hax-windows.o endif ifeq ($(CONFIG_POSIX),y) -obj-$(CONFIG_HAX) +=3D hax-all.o hax-mem.o hax-posix.o +obj-$(CONFIG_HAX) +=3D hax-posix.o endif obj-$(CONFIG_HVF) +=3D hvf/ -obj-$(CONFIG_WHPX) +=3D whpx-all.o +obj-$(CONFIG_WHPX) +=3D whpx-all.o whpx-cpus-interface.o endif obj-$(CONFIG_SEV) +=3D sev.o obj-$(call lnot,$(CONFIG_SEV)) +=3D sev-stub.o diff --git a/target/i386/hax-all.c b/target/i386/hax-all.c index f9c83fff25..56c99bffd2 100644 --- a/target/i386/hax-all.c +++ b/target/i386/hax-all.c @@ -32,9 +32,10 @@ #include "sysemu/accel.h" #include "sysemu/reset.h" #include "sysemu/runstate.h" -#include "qemu/main-loop.h" #include "hw/boards.h" =20 +#include "hax-cpus-interface.h" + #define DEBUG_HAX 0 =20 #define DPRINTF(fmt, ...) \ @@ -361,6 +362,9 @@ static int hax_accel_init(MachineState *ms) !ret ? "working" : "not working", !ret ? "fast virt" : "emulation"); } + if (ret =3D=3D 0) { + cpus_register_accel_interface(&hax_cpus_interface); + } return ret; } =20 diff --git a/target/i386/hax-cpus-interface.c b/target/i386/hax-cpus-interf= ace.c new file mode 100644 index 0000000000..85cbfb4ae8 --- /dev/null +++ b/target/i386/hax-cpus-interface.c @@ -0,0 +1,85 @@ +/* + * QEMU HAX support + * + * Copyright IBM, Corp. 2008 + * Red Hat, Inc. 2008 + * + * Authors: + * Anthony Liguori + * Glauber Costa + * + * Copyright (c) 2011 Intel Corporation + * Written by: + * Jiang Yunhong + * Xin Xiaohui + * Zhang Xiantao + * + * This work is licensed under the terms of the GNU GPL, version 2 or late= r. + * See the COPYING file in the top-level directory. + * + */ + +#include "qemu/osdep.h" +#include "qemu/error-report.h" +#include "qemu/main-loop.h" +#include "hax-i386.h" +#include "sysemu/runstate.h" +#include "sysemu/cpus.h" +#include "qemu/guest-random.h" + +#include "hax-cpus-interface.h" + +static void *hax_cpu_thread_fn(void *arg) +{ + CPUState *cpu =3D arg; + int r; + + rcu_register_thread(); + qemu_mutex_lock_iothread(); + qemu_thread_get_self(cpu->thread); + + cpu->thread_id =3D qemu_get_thread_id(); + hax_init_vcpu(cpu); + cpu_thread_signal_created(cpu); + qemu_guest_random_seed_thread_part2(cpu->random_seed); + + do { + if (cpu_can_run(cpu)) { + r =3D hax_smp_cpu_exec(cpu); + if (r =3D=3D EXCP_DEBUG) { + cpu_handle_guest_debug(cpu); + } + } + + qemu_wait_io_event(cpu); + } while (!cpu->unplug || cpu_can_run(cpu)); + rcu_unregister_thread(); + return NULL; +} + +static void hax_start_vcpu_thread(CPUState *cpu) +{ + char thread_name[VCPU_THREAD_NAME_SIZE]; + + cpu->thread =3D g_malloc0(sizeof(QemuThread)); + cpu->halt_cond =3D g_malloc0(sizeof(QemuCond)); + qemu_cond_init(cpu->halt_cond); + + snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/HAX", + cpu->cpu_index); + qemu_thread_create(cpu->thread, thread_name, hax_cpu_thread_fn, + cpu, QEMU_THREAD_JOINABLE); +#ifdef _WIN32 + cpu->hThread =3D qemu_thread_get_handle(cpu->thread); +#endif +} + +CpusAccelInterface hax_cpus_interface =3D { + .create_vcpu_thread =3D hax_start_vcpu_thread, + .kick_vcpu_thread =3D hax_kick_vcpu_thread, + + .cpu_synchronize_post_reset =3D hax_cpu_synchronize_post_reset, + .cpu_synchronize_post_init =3D hax_cpu_synchronize_post_init, + .cpu_synchronize_state =3D hax_cpu_synchronize_state, + .cpu_synchronize_pre_loadvm =3D hax_cpu_synchronize_pre_loadvm, +}; diff --git a/target/i386/hax-cpus-interface.h b/target/i386/hax-cpus-interf= ace.h new file mode 100644 index 0000000000..a06f3df752 --- /dev/null +++ b/target/i386/hax-cpus-interface.h @@ -0,0 +1,8 @@ +#ifndef HAX_CPUS_INTERFACE_H +#define HAX_CPUS_INTERFACE_H + +#include "sysemu/cpus.h" + +extern CpusAccelInterface hax_cpus_interface; + +#endif /* HAX_CPUS_INTERFACE */ diff --git a/target/i386/hax-i386.h b/target/i386/hax-i386.h index 54e9d8b057..667139c7af 100644 --- a/target/i386/hax-i386.h +++ b/target/i386/hax-i386.h @@ -61,6 +61,8 @@ int hax_inject_interrupt(CPUArchState *env, int vector); struct hax_vm *hax_vm_create(struct hax_state *hax); int hax_vcpu_run(struct hax_vcpu_state *vcpu); int hax_vcpu_create(int id); +void hax_kick_vcpu_thread(CPUState *cpu); + int hax_sync_vcpu_state(CPUArchState *env, struct vcpu_state_t *state, int set); int hax_sync_msr(CPUArchState *env, struct hax_msr_data *msrs, int set); diff --git a/target/i386/hax-posix.c b/target/i386/hax-posix.c index 3bad89f133..ea956ddfc1 100644 --- a/target/i386/hax-posix.c +++ b/target/i386/hax-posix.c @@ -16,6 +16,8 @@ =20 #include "target/i386/hax-i386.h" =20 +#include "sysemu/cpus.h" + hax_fd hax_mod_open(void) { int fd =3D open("/dev/HAX", O_RDWR); @@ -292,3 +294,13 @@ int hax_inject_interrupt(CPUArchState *env, int vector) =20 return ioctl(fd, HAX_VCPU_IOCTL_INTERRUPT, &vector); } + +void hax_kick_vcpu_thread(CPUState *cpu) +{ + /* + * FIXME: race condition with the exit_request check in + * hax_vcpu_hax_exec + */ + cpu->exit_request =3D 1; + cpus_kick_thread(cpu); +} diff --git a/target/i386/hax-windows.c b/target/i386/hax-windows.c index 863c2bcc19..469b48e608 100644 --- a/target/i386/hax-windows.c +++ b/target/i386/hax-windows.c @@ -463,3 +463,23 @@ int hax_inject_interrupt(CPUArchState *env, int vector) return 0; } } + +static void CALLBACK dummy_apc_func(ULONG_PTR unused) +{ +} + +void hax_kick_vcpu_thread(CPUState *cpu) +{ + /* + * FIXME: race condition with the exit_request check in + * hax_vcpu_hax_exec + */ + cpu->exit_request =3D 1; + if (!qemu_cpu_is_self(cpu)) { + if (!QueueUserAPC(dummy_apc_func, cpu->hThread, 0)) { + fprintf(stderr, "%s: QueueUserAPC failed with error %lu\n", + __func__, GetLastError()); + exit(1); + } + } +} diff --git a/target/i386/hvf/Makefile.objs b/target/i386/hvf/Makefile.objs index 927b86bc67..bdbc2c0227 100644 --- a/target/i386/hvf/Makefile.objs +++ b/target/i386/hvf/Makefile.objs @@ -1,2 +1,2 @@ -obj-y +=3D hvf.o +obj-y +=3D hvf.o hvf-cpus-interface.o obj-y +=3D x86.o x86_cpuid.o x86_decode.o x86_descr.o x86_emu.o x86_flags.= o x86_mmu.o x86hvf.o x86_task.o diff --git a/target/i386/hvf/hvf-cpus-interface.c b/target/i386/hvf/hvf-cpu= s-interface.c new file mode 100644 index 0000000000..54bfe307c1 --- /dev/null +++ b/target/i386/hvf/hvf-cpus-interface.c @@ -0,0 +1,92 @@ +#include "qemu/osdep.h" +#include "qemu/error-report.h" +#include "qemu/main-loop.h" +#include "sysemu/hvf.h" +#include "sysemu/runstate.h" +#include "sysemu/cpus.h" +#include "qemu/guest-random.h" + +#include "hvf-cpus-interface.h" + +/* + * The HVF-specific vCPU thread function. This one should only run when th= e host + * CPU supports the VMX "unrestricted guest" feature. + */ +static void *hvf_cpu_thread_fn(void *arg) +{ + CPUState *cpu =3D arg; + + int r; + + assert(hvf_enabled()); + + rcu_register_thread(); + + qemu_mutex_lock_iothread(); + qemu_thread_get_self(cpu->thread); + + cpu->thread_id =3D qemu_get_thread_id(); + cpu->can_do_io =3D 1; + current_cpu =3D cpu; + + hvf_init_vcpu(cpu); + + /* signal CPU creation */ + cpu_thread_signal_created(cpu); + qemu_guest_random_seed_thread_part2(cpu->random_seed); + + do { + if (cpu_can_run(cpu)) { + r =3D hvf_vcpu_exec(cpu); + if (r =3D=3D EXCP_DEBUG) { + cpu_handle_guest_debug(cpu); + } + } + qemu_wait_io_event(cpu); + } while (!cpu->unplug || cpu_can_run(cpu)); + + hvf_vcpu_destroy(cpu); + cpu_thread_signal_destroyed(cpu); + qemu_mutex_unlock_iothread(); + rcu_unregister_thread(); + return NULL; +} + +static void hvf_kick_vcpu_thread(CPUState *cpu) +{ + cpus_kick_thread(cpu); +} + +static void hvf_cpu_synchronize_noop(CPUState *cpu) +{ +} + +static void hvf_start_vcpu_thread(CPUState *cpu) +{ + char thread_name[VCPU_THREAD_NAME_SIZE]; + + /* + * HVF currently does not support TCG, and only runs in + * unrestricted-guest mode. + */ + assert(hvf_enabled()); + + cpu->thread =3D g_malloc0(sizeof(QemuThread)); + cpu->halt_cond =3D g_malloc0(sizeof(QemuCond)); + qemu_cond_init(cpu->halt_cond); + + snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/HVF", + cpu->cpu_index); + qemu_thread_create(cpu->thread, thread_name, hvf_cpu_thread_fn, + cpu, QEMU_THREAD_JOINABLE); +} + +CpusAccelInterface hvf_cpus_interface =3D { + .create_vcpu_thread =3D hvf_start_vcpu_thread, + .kick_vcpu_thread =3D hvf_kick_vcpu_thread, + + .cpu_synchronize_post_reset =3D hvf_cpu_synchronize_noop, + .cpu_synchronize_post_init =3D hvf_cpu_synchronize_noop, + .cpu_synchronize_state =3D hvf_cpu_synchronize_noop, + .cpu_synchronize_pre_loadvm =3D hvf_cpu_synchronize_noop, +}; diff --git a/target/i386/hvf/hvf-cpus-interface.h b/target/i386/hvf/hvf-cpu= s-interface.h new file mode 100644 index 0000000000..6ea38742e5 --- /dev/null +++ b/target/i386/hvf/hvf-cpus-interface.h @@ -0,0 +1,8 @@ +#ifndef HVF_CPUS_INTERFACE_H +#define HVF_CPUS_INTERFACE_H + +#include "sysemu/cpus.h" + +extern CpusAccelInterface hvf_cpus_interface; + +#endif /* HVF_CPUS_INTERFACE */ diff --git a/target/i386/hvf/hvf.c b/target/i386/hvf/hvf.c index d72543dc31..75cdf88cec 100644 --- a/target/i386/hvf/hvf.c +++ b/target/i386/hvf/hvf.c @@ -72,6 +72,8 @@ #include "sysemu/accel.h" #include "target/i386/cpu.h" =20 +#include "hvf-cpus-interface.h" + HVFState *hvf_state; =20 static void assert_hvf_ok(hv_return_t ret) @@ -312,7 +314,7 @@ void hvf_cpu_synchronize_post_reset(CPUState *cpu_state) run_on_cpu(cpu_state, do_hvf_cpu_synchronize_post_reset, RUN_ON_CPU_NU= LL); } =20 -void _hvf_cpu_synchronize_post_init(CPUState *cpu, run_on_cpu_data arg) +static void _hvf_cpu_synchronize_post_init(CPUState *cpu, run_on_cpu_data = arg) { CPUState *cpu_state =3D cpu; hvf_put_registers(cpu_state); @@ -979,6 +981,7 @@ static int hvf_accel_init(MachineState *ms) hvf_state =3D s; cpu_interrupt_handler =3D hvf_handle_interrupt; memory_listener_register(&hvf_memory_listener, &address_space_memory); + cpus_register_accel_interface(&hvf_cpus_interface); return 0; } =20 diff --git a/target/i386/whpx-all.c b/target/i386/whpx-all.c index c78baac6df..daac9858ae 100644 --- a/target/i386/whpx-all.c +++ b/target/i386/whpx-all.c @@ -24,6 +24,8 @@ #include "migration/blocker.h" #include "whp-dispatch.h" =20 +#include "whpx-cpus-interface.h" + #include #include =20 @@ -1575,6 +1577,7 @@ static int whpx_accel_init(MachineState *ms) whpx_memory_init(); =20 cpu_interrupt_handler =3D whpx_handle_interrupt; + cpus_register_accel_interface(&whpx_cpus_interface); =20 printf("Windows Hypervisor Platform accelerator is operational\n"); return 0; diff --git a/target/i386/whpx-cpus-interface.c b/target/i386/whpx-cpus-inte= rface.c new file mode 100644 index 0000000000..d2a4ef0699 --- /dev/null +++ b/target/i386/whpx-cpus-interface.c @@ -0,0 +1,96 @@ +/* + * QEMU Windows Hypervisor Platform accelerator (WHPX) + * + * Copyright Microsoft Corp. 2017 + * + * This work is licensed under the terms of the GNU GPL, version 2 or late= r. + * See the COPYING file in the top-level directory. + * + */ + +#include "qemu/osdep.h" +#include "sysemu/kvm_int.h" +#include "qemu/main-loop.h" +#include "sysemu/cpus.h" +#include "qemu/guest-random.h" + +#include "sysemu/whpx.h" +#include "whpx-cpus-interface.h" + +#include +#include + +static void *whpx_cpu_thread_fn(void *arg) +{ + CPUState *cpu =3D arg; + int r; + + rcu_register_thread(); + + qemu_mutex_lock_iothread(); + qemu_thread_get_self(cpu->thread); + cpu->thread_id =3D qemu_get_thread_id(); + current_cpu =3D cpu; + + r =3D whpx_init_vcpu(cpu); + if (r < 0) { + fprintf(stderr, "whpx_init_vcpu failed: %s\n", strerror(-r)); + exit(1); + } + + /* signal CPU creation */ + cpu_thread_signal_created(cpu); + qemu_guest_random_seed_thread_part2(cpu->random_seed); + + do { + if (cpu_can_run(cpu)) { + r =3D whpx_vcpu_exec(cpu); + if (r =3D=3D EXCP_DEBUG) { + cpu_handle_guest_debug(cpu); + } + } + while (cpu_thread_is_idle(cpu)) { + qemu_cond_wait_iothread(cpu->halt_cond); + } + qemu_wait_io_event_common(cpu); + } while (!cpu->unplug || cpu_can_run(cpu)); + + whpx_destroy_vcpu(cpu); + cpu_thread_signal_destroyed(cpu); + qemu_mutex_unlock_iothread(); + rcu_unregister_thread(); + return NULL; +} + +static void whpx_start_vcpu_thread(CPUState *cpu) +{ + char thread_name[VCPU_THREAD_NAME_SIZE]; + + cpu->thread =3D g_malloc0(sizeof(QemuThread)); + cpu->halt_cond =3D g_malloc0(sizeof(QemuCond)); + qemu_cond_init(cpu->halt_cond); + snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/WHPX", + cpu->cpu_index); + qemu_thread_create(cpu->thread, thread_name, whpx_cpu_thread_fn, + cpu, QEMU_THREAD_JOINABLE); +#ifdef _WIN32 + cpu->hThread =3D qemu_thread_get_handle(cpu->thread); +#endif +} + +static void whpx_kick_vcpu_thread(CPUState *cpu) +{ + if (!qemu_cpu_is_self(cpu)) { + whpx_vcpu_kick(cpu); + } +} + +CpusAccelInterface whpx_cpus_interface =3D { + .create_vcpu_thread =3D whpx_start_vcpu_thread, + .kick_vcpu_thread =3D whpx_kick_vcpu_thread, + + .cpu_synchronize_post_reset =3D whpx_cpu_synchronize_post_reset, + .cpu_synchronize_post_init =3D whpx_cpu_synchronize_post_init, + .cpu_synchronize_state =3D whpx_cpu_synchronize_state, + .cpu_synchronize_pre_loadvm =3D whpx_cpu_synchronize_pre_loadvm, +}; diff --git a/target/i386/whpx-cpus-interface.h b/target/i386/whpx-cpus-inte= rface.h new file mode 100644 index 0000000000..084e8b15b8 --- /dev/null +++ b/target/i386/whpx-cpus-interface.h @@ -0,0 +1,8 @@ +#ifndef WHPX_CPUS_INTERFACE_H +#define WHPX_CPUS_INTERFACE_H + +#include "sysemu/cpus.h" + +extern CpusAccelInterface whpx_cpus_interface; + +#endif /* WHPX_CPUS_INTERFACE */ --=20 2.16.4