From nobody Mon Feb  9 04:55:47 2026
Delivered-To: importer@patchew.org
Authentication-Results: mx.zohomail.com;
	spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as
 permitted sender)
  smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org
ARC-Seal: i=1; a=rsa-sha256; t=1590087342; cv=none;
	d=zohomail.com; s=zohoarc;
	b=e0pT9cfdpmcqnaZaSOERwEXLEF3p9zswcmelNca47+NsvCX4lo+WpmpyrmfxSqdMcPg0IvwelKHAyy0IebbO2kOiPwhHSdRKN1liErKYDwI9rxkzvAqeKfnYf2rdBV/ugWEc0eh4ackRws7yJq2fqvnT1LVt7nveSYpsq0xxtTs=
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com;
 s=zohoarc;
	t=1590087342;
 h=Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:Message-ID:References:Sender:Subject:To;
	bh=YhIgy1Dlz/IC/nTk/8RAl7lz0vQug70v0BzMwBv+YcY=;
	b=ebdp/KU1bfSCE7hFPay+98MsCf7tBECKxG4wd0m80I4/Wd0KbhBlOMtM0fxpf+W/+YWcfM0bIUciuUW64HGnwHTcOxRLDa5Yh0/HSlfFMB4oBbJlFmsEC1eJ+yH89QKmbwUFi3zye36rkxYhtETPvErhixX55KN1OhyyHIL0lkk=
ARC-Authentication-Results: i=1; mx.zohomail.com;
	spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as
 permitted sender)
  smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org
Return-Path: <qemu-devel-bounces+importer=patchew.org@nongnu.org>
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by
 mx.zohomail.com
	with SMTPS id 1590087342058819.1363727768228;
 Thu, 21 May 2020 11:55:42 -0700 (PDT)
Received: from localhost ([::1]:42162 helo=lists1p.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.90_1)
	(envelope-from <qemu-devel-bounces+importer=patchew.org@nongnu.org>)
	id 1jbqM0-0002Ji-G1
	for importer@patchew.org; Thu, 21 May 2020 14:55:40 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]:35252)
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <cfontana@suse.de>) id 1jbqKd-0000v2-MK
 for qemu-devel@nongnu.org; Thu, 21 May 2020 14:54:15 -0400
Received: from mx2.suse.de ([195.135.220.15]:49886)
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <cfontana@suse.de>) id 1jbqKb-00040i-VX
 for qemu-devel@nongnu.org; Thu, 21 May 2020 14:54:15 -0400
Received: from relay2.suse.de (unknown [195.135.220.254])
 by mx2.suse.de (Postfix) with ESMTP id B8C4DAFF8;
 Thu, 21 May 2020 18:54:14 +0000 (UTC)
X-Virus-Scanned: by amavisd-new at test-mx.suse.de
From: Claudio Fontana <cfontana@suse.de>
To: Paolo Bonzini <pbonzini@redhat.com>,
 =?UTF-8?q?Alex=20Benn=C3=A9e?= <alex.bennee@linaro.org>,
 Peter Maydell <peter.maydell@linaro.org>,
 =?UTF-8?q?Philippe=20Mathieu-Daud=C3=A9?= <philmd@redhat.com>
Subject: [RFC 1/3] cpu-throttle: new module, extracted from cpus.c
Date: Thu, 21 May 2020 20:54:05 +0200
Message-Id: <20200521185407.25311-2-cfontana@suse.de>
X-Mailer: git-send-email 2.16.4
In-Reply-To: <20200521185407.25311-1-cfontana@suse.de>
References: <20200521185407.25311-1-cfontana@suse.de>
Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17
 as permitted sender) client-ip=209.51.188.17;
 envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org;
 helo=lists.gnu.org;
Received-SPF: pass client-ip=195.135.220.15; envelope-from=cfontana@suse.de;
 helo=mx2.suse.de
X-detected-operating-system: by eggs.gnu.org: First seen = 2020/05/21 02:02:44
X-ACL-Warn: Detected OS   = Linux 2.2.x-3.x (no timestamps) [generic]
X-Spam_score_int: -41
X-Spam_score: -4.2
X-Spam_bar: ----
X-Spam_report: (-4.2 / 5.0 requ) BAYES_00=-1.9, RCVD_IN_DNSWL_MED=-2.3,
 RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_PASS=-0.001,
 URIBL_BLOCKED=0.001 autolearn=_AUTOLEARN
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
 <mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <https://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
 <mailto:qemu-devel-request@nongnu.org?subject=subscribe>
Cc: Laurent Vivier <lvivier@redhat.com>, Thomas Huth <thuth@redhat.com>,
 Eduardo Habkost <ehabkost@redhat.com>, Marcelo Tosatti <mtosatti@redhat.com>,
 "open list:All patches CC here" <qemu-devel@nongnu.org>,
 Roman Bolshakov <r.bolshakov@yadro.com>,
 Wenchao Wang <wenchao.wang@intel.com>,
 Colin Xu <colin.xu@intel.com>, Claudio Fontana <cfontana@suse.de>,
 "open list:X86 HAXM CPUs" <haxm-team@intel.com>,
 Sunil Muthuswamy <sunilmut@microsoft.com>,
 Richard Henderson <rth@twiddle.net>
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel" <qemu-devel-bounces+importer=patchew.org@nongnu.org>
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

this is a first step in the refactoring of cpus.c.

Signed-off-by: Claudio Fontana <cfontana@suse.de>
---
 MAINTAINERS                   |   1 +
 Makefile.target               |   8 ++-
 cpu-throttle.c                | 122 ++++++++++++++++++++++++++++++++++++++=
++++
 cpus.c                        |  95 +++-----------------------------
 include/hw/core/cpu.h         |  37 -------------
 include/qemu/main-loop.h      |   5 ++
 include/sysemu/cpu-throttle.h |  50 +++++++++++++++++
 migration/migration.c         |   1 +
 migration/ram.c               |   1 +
 9 files changed, 195 insertions(+), 125 deletions(-)
 create mode 100644 cpu-throttle.c
 create mode 100644 include/sysemu/cpu-throttle.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 87a412c229..35864a275a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2142,6 +2142,7 @@ Main loop
 M: Paolo Bonzini <pbonzini@redhat.com>
 S: Maintained
 F: cpus.c
+F: cpu-throttle.c
 F: include/qemu/main-loop.h
 F: include/sysemu/runstate.h
 F: util/main-loop.c
diff --git a/Makefile.target b/Makefile.target
index 8ed1eba95b..60cfa2a78b 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -152,7 +152,13 @@ endif #CONFIG_BSD_USER
 #########################################################
 # System emulator target
 ifdef CONFIG_SOFTMMU
-obj-y +=3D arch_init.o cpus.o gdbstub.o balloon.o ioport.o
+obj-y +=3D arch_init.o
+obj-y +=3D cpus.o
+obj-y +=3D cpu-throttle.o
+obj-y +=3D gdbstub.o
+obj-y +=3D balloon.o
+obj-y +=3D ioport.o
+
 obj-y +=3D qtest.o
 obj-y +=3D dump/
 obj-y +=3D hw/
diff --git a/cpu-throttle.c b/cpu-throttle.c
new file mode 100644
index 0000000000..4e6b2818ca
--- /dev/null
+++ b/cpu-throttle.c
@@ -0,0 +1,122 @@
+/*
+ * QEMU System Emulator
+ *
+ * Copyright (c) 2003-2008 Fabrice Bellard
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a=
 copy
+ * of this software and associated documentation files (the "Software"), t=
o deal
+ * in the Software without restriction, including without limitation the r=
ights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or se=
ll
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included=
 in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS=
 OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OT=
HER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING=
 FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS =
IN
+ * THE SOFTWARE.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu-common.h"
+#include "qemu/thread.h"
+#include "hw/core/cpu.h"
+#include "qemu/main-loop.h"
+#include "sysemu/cpus.h"
+#include "sysemu/cpu-throttle.h"
+
+/* vcpu throttling controls */
+static QEMUTimer *throttle_timer;
+static unsigned int throttle_percentage;
+
+#define CPU_THROTTLE_PCT_MIN 1
+#define CPU_THROTTLE_PCT_MAX 99
+#define CPU_THROTTLE_TIMESLICE_NS 10000000
+
+static void cpu_throttle_thread(CPUState *cpu, run_on_cpu_data opaque)
+{
+    double pct;
+    double throttle_ratio;
+    int64_t sleeptime_ns, endtime_ns;
+
+    if (!cpu_throttle_get_percentage()) {
+        return;
+    }
+
+    pct =3D (double)cpu_throttle_get_percentage() / 100;
+    throttle_ratio =3D pct / (1 - pct);
+    /* Add 1ns to fix double's rounding error (like 0.9999999...) */
+    sleeptime_ns =3D (int64_t)(throttle_ratio * CPU_THROTTLE_TIMESLICE_NS =
+ 1);
+    endtime_ns =3D qemu_clock_get_ns(QEMU_CLOCK_REALTIME) + sleeptime_ns;
+    while (sleeptime_ns > 0 && !cpu->stop) {
+        if (sleeptime_ns > SCALE_MS) {
+            qemu_cond_timedwait_iothread(cpu->halt_cond,
+                                         sleeptime_ns / SCALE_MS);
+        } else {
+            qemu_mutex_unlock_iothread();
+            g_usleep(sleeptime_ns / SCALE_US);
+            qemu_mutex_lock_iothread();
+        }
+        sleeptime_ns =3D endtime_ns - qemu_clock_get_ns(QEMU_CLOCK_REALTIM=
E);
+    }
+    atomic_set(&cpu->throttle_thread_scheduled, 0);
+}
+
+static void cpu_throttle_timer_tick(void *opaque)
+{
+    CPUState *cpu;
+    double pct;
+
+    /* Stop the timer if needed */
+    if (!cpu_throttle_get_percentage()) {
+        return;
+    }
+    CPU_FOREACH(cpu) {
+        if (!atomic_xchg(&cpu->throttle_thread_scheduled, 1)) {
+            async_run_on_cpu(cpu, cpu_throttle_thread,
+                             RUN_ON_CPU_NULL);
+        }
+    }
+
+    pct =3D (double)cpu_throttle_get_percentage() / 100;
+    timer_mod(throttle_timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL_RT) +
+                                   CPU_THROTTLE_TIMESLICE_NS / (1 - pct));
+}
+
+void cpu_throttle_set(int new_throttle_pct)
+{
+    /* Ensure throttle percentage is within valid range */
+    new_throttle_pct =3D MIN(new_throttle_pct, CPU_THROTTLE_PCT_MAX);
+    new_throttle_pct =3D MAX(new_throttle_pct, CPU_THROTTLE_PCT_MIN);
+
+    atomic_set(&throttle_percentage, new_throttle_pct);
+
+    timer_mod(throttle_timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL_RT) +
+                                       CPU_THROTTLE_TIMESLICE_NS);
+}
+
+void cpu_throttle_stop(void)
+{
+    atomic_set(&throttle_percentage, 0);
+}
+
+bool cpu_throttle_active(void)
+{
+    return (cpu_throttle_get_percentage() !=3D 0);
+}
+
+int cpu_throttle_get_percentage(void)
+{
+    return atomic_read(&throttle_percentage);
+}
+
+void cpu_throttle_init(void)
+{
+    throttle_timer =3D timer_new_ns(QEMU_CLOCK_VIRTUAL_RT,
+                                  cpu_throttle_timer_tick, NULL);
+}
diff --git a/cpus.c b/cpus.c
index 5670c96bcf..3a46a4fc2b 100644
--- a/cpus.c
+++ b/cpus.c
@@ -61,6 +61,8 @@
 #include "hw/boards.h"
 #include "hw/hw.h"
=20
+#include "sysemu/cpu-throttle.h"
+
 #ifdef CONFIG_LINUX
=20
 #include <sys/prctl.h>
@@ -84,14 +86,6 @@ static QemuMutex qemu_global_mutex;
 int64_t max_delay;
 int64_t max_advance;
=20
-/* vcpu throttling controls */
-static QEMUTimer *throttle_timer;
-static unsigned int throttle_percentage;
-
-#define CPU_THROTTLE_PCT_MIN 1
-#define CPU_THROTTLE_PCT_MAX 99
-#define CPU_THROTTLE_TIMESLICE_NS 10000000
-
 bool cpu_is_stopped(CPUState *cpu)
 {
     return cpu->stopped || !runstate_is_running();
@@ -710,90 +704,12 @@ static const VMStateDescription vmstate_timers =3D {
     }
 };
=20
-static void cpu_throttle_thread(CPUState *cpu, run_on_cpu_data opaque)
-{
-    double pct;
-    double throttle_ratio;
-    int64_t sleeptime_ns, endtime_ns;
-
-    if (!cpu_throttle_get_percentage()) {
-        return;
-    }
-
-    pct =3D (double)cpu_throttle_get_percentage()/100;
-    throttle_ratio =3D pct / (1 - pct);
-    /* Add 1ns to fix double's rounding error (like 0.9999999...) */
-    sleeptime_ns =3D (int64_t)(throttle_ratio * CPU_THROTTLE_TIMESLICE_NS =
+ 1);
-    endtime_ns =3D qemu_clock_get_ns(QEMU_CLOCK_REALTIME) + sleeptime_ns;
-    while (sleeptime_ns > 0 && !cpu->stop) {
-        if (sleeptime_ns > SCALE_MS) {
-            qemu_cond_timedwait(cpu->halt_cond, &qemu_global_mutex,
-                                sleeptime_ns / SCALE_MS);
-        } else {
-            qemu_mutex_unlock_iothread();
-            g_usleep(sleeptime_ns / SCALE_US);
-            qemu_mutex_lock_iothread();
-        }
-        sleeptime_ns =3D endtime_ns - qemu_clock_get_ns(QEMU_CLOCK_REALTIM=
E);
-    }
-    atomic_set(&cpu->throttle_thread_scheduled, 0);
-}
-
-static void cpu_throttle_timer_tick(void *opaque)
-{
-    CPUState *cpu;
-    double pct;
-
-    /* Stop the timer if needed */
-    if (!cpu_throttle_get_percentage()) {
-        return;
-    }
-    CPU_FOREACH(cpu) {
-        if (!atomic_xchg(&cpu->throttle_thread_scheduled, 1)) {
-            async_run_on_cpu(cpu, cpu_throttle_thread,
-                             RUN_ON_CPU_NULL);
-        }
-    }
-
-    pct =3D (double)cpu_throttle_get_percentage()/100;
-    timer_mod(throttle_timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL_RT) +
-                                   CPU_THROTTLE_TIMESLICE_NS / (1-pct));
-}
-
-void cpu_throttle_set(int new_throttle_pct)
-{
-    /* Ensure throttle percentage is within valid range */
-    new_throttle_pct =3D MIN(new_throttle_pct, CPU_THROTTLE_PCT_MAX);
-    new_throttle_pct =3D MAX(new_throttle_pct, CPU_THROTTLE_PCT_MIN);
-
-    atomic_set(&throttle_percentage, new_throttle_pct);
-
-    timer_mod(throttle_timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL_RT) +
-                                       CPU_THROTTLE_TIMESLICE_NS);
-}
-
-void cpu_throttle_stop(void)
-{
-    atomic_set(&throttle_percentage, 0);
-}
-
-bool cpu_throttle_active(void)
-{
-    return (cpu_throttle_get_percentage() !=3D 0);
-}
-
-int cpu_throttle_get_percentage(void)
-{
-    return atomic_read(&throttle_percentage);
-}
-
 void cpu_ticks_init(void)
 {
     seqlock_init(&timers_state.vm_clock_seqlock);
     qemu_spin_init(&timers_state.vm_clock_lock);
     vmstate_register(NULL, 0, &vmstate_timers, &timers_state);
-    throttle_timer =3D timer_new_ns(QEMU_CLOCK_VIRTUAL_RT,
-                                           cpu_throttle_timer_tick, NULL);
+    cpu_throttle_init();
 }
=20
 void configure_icount(QemuOpts *opts, Error **errp)
@@ -1852,6 +1768,11 @@ void qemu_cond_wait_iothread(QemuCond *cond)
     qemu_cond_wait(cond, &qemu_global_mutex);
 }
=20
+void qemu_cond_timedwait_iothread(QemuCond *cond, int ms)
+{
+    qemu_cond_timedwait(cond, &qemu_global_mutex, ms);
+}
+
 static bool all_vcpus_paused(void)
 {
     CPUState *cpu;
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index 07f7698155..e6b75d456c 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -817,43 +817,6 @@ bool cpu_exists(int64_t id);
  */
 CPUState *cpu_by_arch_id(int64_t id);
=20
-/**
- * cpu_throttle_set:
- * @new_throttle_pct: Percent of sleep time. Valid range is 1 to 99.
- *
- * Throttles all vcpus by forcing them to sleep for the given percentage of
- * time. A throttle_percentage of 25 corresponds to a 75% duty cycle rough=
ly.
- * (example: 10ms sleep for every 30ms awake).
- *
- * cpu_throttle_set can be called as needed to adjust new_throttle_pct.
- * Once the throttling starts, it will remain in effect until cpu_throttle=
_stop
- * is called.
- */
-void cpu_throttle_set(int new_throttle_pct);
-
-/**
- * cpu_throttle_stop:
- *
- * Stops the vcpu throttling started by cpu_throttle_set.
- */
-void cpu_throttle_stop(void);
-
-/**
- * cpu_throttle_active:
- *
- * Returns: %true if the vcpus are currently being throttled, %false other=
wise.
- */
-bool cpu_throttle_active(void);
-
-/**
- * cpu_throttle_get_percentage:
- *
- * Returns the vcpu throttle percentage. See cpu_throttle_set for details.
- *
- * Returns: The throttle percentage in range 1 to 99.
- */
-int cpu_throttle_get_percentage(void);
-
 #ifndef CONFIG_USER_ONLY
=20
 typedef void (*CPUInterruptHandler)(CPUState *, int);
diff --git a/include/qemu/main-loop.h b/include/qemu/main-loop.h
index a6d20b0719..2fa3d90ad6 100644
--- a/include/qemu/main-loop.h
+++ b/include/qemu/main-loop.h
@@ -263,6 +263,11 @@ int qemu_add_child_watch(pid_t pid);
  */
 bool qemu_mutex_iothread_locked(void);
=20
+/*
+ * qemu_cond_timedwait_iothread: like the previous, but with timeout
+ */
+void qemu_cond_timedwait_iothread(QemuCond *cond, int ms);
+
 /**
  * qemu_mutex_lock_iothread: Lock the main loop mutex.
  *
diff --git a/include/sysemu/cpu-throttle.h b/include/sysemu/cpu-throttle.h
new file mode 100644
index 0000000000..22356502a5
--- /dev/null
+++ b/include/sysemu/cpu-throttle.h
@@ -0,0 +1,50 @@
+#ifndef SYSEMU_CPU_THROTTLE_H
+#define SYSEMU_CPU_THROTTLE_H
+
+#include "qemu/timer.h"
+
+/**
+ * cpu_throttle_init:
+ *
+ * Initialize the CPU throttling API.
+ */
+void cpu_throttle_init(void);
+
+/**
+ * cpu_throttle_set:
+ * @new_throttle_pct: Percent of sleep time. Valid range is 1 to 99.
+ *
+ * Throttles all vcpus by forcing them to sleep for the given percentage of
+ * time. A throttle_percentage of 25 corresponds to a 75% duty cycle rough=
ly.
+ * (example: 10ms sleep for every 30ms awake).
+ *
+ * cpu_throttle_set can be called as needed to adjust new_throttle_pct.
+ * Once the throttling starts, it will remain in effect until cpu_throttle=
_stop
+ * is called.
+ */
+void cpu_throttle_set(int new_throttle_pct);
+
+/**
+ * cpu_throttle_stop:
+ *
+ * Stops the vcpu throttling started by cpu_throttle_set.
+ */
+void cpu_throttle_stop(void);
+
+/**
+ * cpu_throttle_active:
+ *
+ * Returns: %true if the vcpus are currently being throttled, %false other=
wise.
+ */
+bool cpu_throttle_active(void);
+
+/**
+ * cpu_throttle_get_percentage:
+ *
+ * Returns the vcpu throttle percentage. See cpu_throttle_set for details.
+ *
+ * Returns: The throttle percentage in range 1 to 99.
+ */
+int cpu_throttle_get_percentage(void);
+
+#endif /* SYSEMU_CPU_THROTTLE_H */
diff --git a/migration/migration.c b/migration/migration.c
index 0bb042a0f7..dd18323f32 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -23,6 +23,7 @@
 #include "socket.h"
 #include "sysemu/runstate.h"
 #include "sysemu/sysemu.h"
+#include "sysemu/cpu-throttle.h"
 #include "rdma.h"
 #include "ram.h"
 #include "migration/global_state.h"
diff --git a/migration/ram.c b/migration/ram.c
index 859f835f1a..527f0c7316 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -52,6 +52,7 @@
 #include "migration/colo.h"
 #include "block.h"
 #include "sysemu/sysemu.h"
+#include "sysemu/cpu-throttle.h"
 #include "savevm.h"
 #include "qemu/iov.h"
 #include "multifd.h"
--=20
2.16.4


From nobody Mon Feb  9 04:55:47 2026
Delivered-To: importer@patchew.org
Authentication-Results: mx.zohomail.com;
	spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as
 permitted sender)
  smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org
ARC-Seal: i=1; a=rsa-sha256; t=1590087479; cv=none;
	d=zohomail.com; s=zohoarc;
	b=XBLeEFL0mXwidKyrDgG1k0AN27hnrcMiXDpRrCO1t9TXJMuVBHkW7GJHxyl8f2mJxTTGO1S1EIXoRjQ6YQhRA3WJYo7lXFW9Cu1yRsZ5ikNAczgIFPOK+WZdmOFdE5H+mwDv8T9005w9w5zj7DE/0dJljO+VoLBup24XYZ1SmnU=
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com;
 s=zohoarc;
	t=1590087479;
 h=Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:Message-ID:References:Sender:Subject:To;
	bh=FjO1AI1p6eVoeErlPRsLi80I804mR21m8ae4DI+Zrn0=;
	b=JBbdM0q9NvgEjruysrDBJRD4tqmFEJy8EYt2YY8YcfxbKyy3wMhahI7MW70PZ8hpnDaVbwbyyTfwdxX6aiIzVvXTPZoNr7r+8bfWq5tBNFg7NXgBY/2nQykdfzI3i3Mh/KEEqwR3HqggFv3YslujtB+8hKPVBIeq+KLq0ysaKC8=
ARC-Authentication-Results: i=1; mx.zohomail.com;
	spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as
 permitted sender)
  smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org
Return-Path: <qemu-devel-bounces+importer=patchew.org@nongnu.org>
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by
 mx.zohomail.com
	with SMTPS id 1590087479333785.8734410125821;
 Thu, 21 May 2020 11:57:59 -0700 (PDT)
Received: from localhost ([::1]:49566 helo=lists1p.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.90_1)
	(envelope-from <qemu-devel-bounces+importer=patchew.org@nongnu.org>)
	id 1jbqOD-0005zF-M0
	for importer@patchew.org; Thu, 21 May 2020 14:57:57 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]:35320)
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <cfontana@suse.de>) id 1jbqKj-00017N-2e
 for qemu-devel@nongnu.org; Thu, 21 May 2020 14:54:21 -0400
Received: from mx2.suse.de ([195.135.220.15]:49916)
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <cfontana@suse.de>) id 1jbqKd-00041k-4k
 for qemu-devel@nongnu.org; Thu, 21 May 2020 14:54:20 -0400
Received: from relay2.suse.de (unknown [195.135.220.254])
 by mx2.suse.de (Postfix) with ESMTP id 51FDCB019;
 Thu, 21 May 2020 18:54:15 +0000 (UTC)
X-Virus-Scanned: by amavisd-new at test-mx.suse.de
From: Claudio Fontana <cfontana@suse.de>
To: Paolo Bonzini <pbonzini@redhat.com>,
 =?UTF-8?q?Alex=20Benn=C3=A9e?= <alex.bennee@linaro.org>,
 Peter Maydell <peter.maydell@linaro.org>,
 =?UTF-8?q?Philippe=20Mathieu-Daud=C3=A9?= <philmd@redhat.com>
Subject: [RFC 2/3] cpu-timers: new module extracted from cpus.c
Date: Thu, 21 May 2020 20:54:06 +0200
Message-Id: <20200521185407.25311-3-cfontana@suse.de>
X-Mailer: git-send-email 2.16.4
In-Reply-To: <20200521185407.25311-1-cfontana@suse.de>
References: <20200521185407.25311-1-cfontana@suse.de>
Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17
 as permitted sender) client-ip=209.51.188.17;
 envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org;
 helo=lists.gnu.org;
Received-SPF: pass client-ip=195.135.220.15; envelope-from=cfontana@suse.de;
 helo=mx2.suse.de
X-detected-operating-system: by eggs.gnu.org: First seen = 2020/05/21 02:02:44
X-ACL-Warn: Detected OS   = Linux 2.2.x-3.x (no timestamps) [generic]
X-Spam_score_int: -41
X-Spam_score: -4.2
X-Spam_bar: ----
X-Spam_report: (-4.2 / 5.0 requ) BAYES_00=-1.9, RCVD_IN_DNSWL_MED=-2.3,
 RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_PASS=-0.001,
 URIBL_BLOCKED=0.001 autolearn=_AUTOLEARN
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
 <mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <https://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
 <mailto:qemu-devel-request@nongnu.org?subject=subscribe>
Cc: Laurent Vivier <lvivier@redhat.com>, Thomas Huth <thuth@redhat.com>,
 Eduardo Habkost <ehabkost@redhat.com>, Marcelo Tosatti <mtosatti@redhat.com>,
 "open list:All patches CC here" <qemu-devel@nongnu.org>,
 Roman Bolshakov <r.bolshakov@yadro.com>,
 Wenchao Wang <wenchao.wang@intel.com>,
 Colin Xu <colin.xu@intel.com>, Claudio Fontana <cfontana@suse.de>,
 "open list:X86 HAXM CPUs" <haxm-team@intel.com>,
 Sunil Muthuswamy <sunilmut@microsoft.com>,
 Richard Henderson <rth@twiddle.net>
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel" <qemu-devel-bounces+importer=patchew.org@nongnu.org>
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Signed-off-by: Claudio Fontana <cfontana@suse.de>
---
 MAINTAINERS                 |   1 +
 Makefile.target             |   1 +
 accel/qtest.c               |   3 +-
 accel/tcg/cpu-exec.c        |  43 ++-
 accel/tcg/tcg-all.c         |   7 +-
 accel/tcg/translate-all.c   |   3 +-
 cpu-timers.c                | 776 ++++++++++++++++++++++++++++++++++++++++=
++++
 cpus.c                      | 731 +----------------------------------------
 docs/replay.txt             |   6 +-
 exec.c                      |   4 -
 hw/core/ptimer.c            |   6 +-
 hw/i386/x86.c               |   1 +
 include/exec/cpu-all.h      |   4 +
 include/exec/exec-all.h     |   4 +-
 include/qemu/timer.h        |  20 --
 include/sysemu/cpu-timers.h |  73 +++++
 include/sysemu/cpus.h       |  12 +-
 include/sysemu/replay.h     |   4 +-
 qtest.c                     |   2 +-
 replay/replay.c             |   6 +-
 softmmu/vl.c                |   8 +-
 stubs/clock-warp.c          |   4 +-
 stubs/cpu-get-clock.c       |   2 +-
 stubs/cpu-get-icount.c      |  14 +-
 target/alpha/translate.c    |   3 +-
 target/arm/helper.c         |   7 +-
 target/riscv/csr.c          |   8 +-
 tests/ptimer-test-stubs.c   |   6 +
 tests/test-timed-average.c  |   2 +-
 util/main-loop.c            |   4 +-
 util/qemu-timer.c           |   9 +-
 31 files changed, 965 insertions(+), 809 deletions(-)
 create mode 100644 cpu-timers.c
 create mode 100644 include/sysemu/cpu-timers.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 35864a275a..1b3b17fda8 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2143,6 +2143,7 @@ M: Paolo Bonzini <pbonzini@redhat.com>
 S: Maintained
 F: cpus.c
 F: cpu-throttle.c
+F: cpu-timers.c
 F: include/qemu/main-loop.h
 F: include/sysemu/runstate.h
 F: util/main-loop.c
diff --git a/Makefile.target b/Makefile.target
index 60cfa2a78b..1d40237375 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -155,6 +155,7 @@ ifdef CONFIG_SOFTMMU
 obj-y +=3D arch_init.o
 obj-y +=3D cpus.o
 obj-y +=3D cpu-throttle.o
+obj-y +=3D cpu-timers.o
 obj-y +=3D gdbstub.o
 obj-y +=3D balloon.o
 obj-y +=3D ioport.o
diff --git a/accel/qtest.c b/accel/qtest.c
index 5b88f55921..ef9ee0941a 100644
--- a/accel/qtest.c
+++ b/accel/qtest.c
@@ -19,13 +19,14 @@
 #include "sysemu/accel.h"
 #include "sysemu/qtest.h"
 #include "sysemu/cpus.h"
+#include "sysemu/cpu-timers.h"
=20
 static int qtest_init_accel(MachineState *ms)
 {
     QemuOpts *opts =3D qemu_opts_create(qemu_find_opts("icount"), NULL, 0,
                                       &error_abort);
     qemu_opt_set(opts, "shift", "0", &error_abort);
-    configure_icount(opts, &error_abort);
+    icount_configure(opts, &error_abort);
     qemu_opts_del(opts);
     return 0;
 }
diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index d95c4848a4..82155c1db3 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -19,6 +19,7 @@
=20
 #include "qemu/osdep.h"
 #include "qemu-common.h"
+#include "qemu/qemu-print.h"
 #include "cpu.h"
 #include "trace.h"
 #include "disas/disas.h"
@@ -36,6 +37,8 @@
 #include "hw/i386/apic.h"
 #endif
 #include "sysemu/cpus.h"
+#include "exec/cpu-all.h"
+#include "sysemu/cpu-timers.h"
 #include "sysemu/replay.h"
=20
 /* -icount align implementation. */
@@ -56,6 +59,9 @@ typedef struct SyncClocks {
 #define MAX_DELAY_PRINT_RATE 2000000000LL
 #define MAX_NB_PRINTS 100
=20
+static int64_t max_delay;
+static int64_t max_advance;
+
 static void align_clocks(SyncClocks *sc, CPUState *cpu)
 {
     int64_t cpu_icount;
@@ -65,7 +71,7 @@ static void align_clocks(SyncClocks *sc, CPUState *cpu)
     }
=20
     cpu_icount =3D cpu->icount_extra + cpu_neg(cpu)->icount_decr.u16.low;
-    sc->diff_clk +=3D cpu_icount_to_ns(sc->last_cpu_icount - cpu_icount);
+    sc->diff_clk +=3D icount_to_ns(sc->last_cpu_icount - cpu_icount);
     sc->last_cpu_icount =3D cpu_icount;
=20
     if (sc->diff_clk > VM_CLOCK_ADVANCE) {
@@ -98,9 +104,9 @@ static void print_delay(const SyncClocks *sc)
             (-sc->diff_clk / (float)1000000000LL <
              (threshold_delay - THRESHOLD_REDUCE))) {
             threshold_delay =3D (-sc->diff_clk / 1000000000LL) + 1;
-            printf("Warning: The guest is now late by %.1f to %.1f seconds=
\n",
-                   threshold_delay - 1,
-                   threshold_delay);
+            qemu_printf("Warning: The guest is now late by %.1f to %.1f se=
conds\n",
+                        threshold_delay - 1,
+                        threshold_delay);
             nb_prints++;
             last_realtime_clock =3D sc->realtime_clock;
         }
@@ -597,7 +603,7 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
=20
     /* Finally, check if we need to exit to the main loop.  */
     if (unlikely(atomic_read(&cpu->exit_request))
-        || (use_icount
+        || (icount_enabled()
             && cpu_neg(cpu)->icount_decr.u16.low + cpu->icount_extra =3D=
=3D 0)) {
         atomic_set(&cpu->exit_request, 0);
         if (cpu->exception_index =3D=3D -1) {
@@ -638,10 +644,10 @@ static inline void cpu_loop_exec_tb(CPUState *cpu, Tr=
anslationBlock *tb,
     }
=20
     /* Instruction counter expired.  */
-    assert(use_icount);
+    assert(icount_enabled());
 #ifndef CONFIG_USER_ONLY
     /* Ensure global icount has gone forward */
-    cpu_update_icount(cpu);
+    icount_update(cpu);
     /* Refill decrementer and continue execution.  */
     insns_left =3D MIN(0xffff, cpu->icount_budget);
     cpu_neg(cpu)->icount_decr.u16.low =3D insns_left;
@@ -741,3 +747,26 @@ int cpu_exec(CPUState *cpu)
=20
     return ret;
 }
+
+#ifndef CONFIG_USER_ONLY
+
+void dump_drift_info(void)
+{
+    if (!icount_enabled()) {
+        return;
+    }
+
+    qemu_printf("Host - Guest clock  %"PRIi64" ms\n",
+                (cpu_get_clock() - icount_get()) / SCALE_MS);
+    if (icount_align_option) {
+        qemu_printf("Max guest delay     %"PRIi64" ms\n",
+                    -max_delay / SCALE_MS);
+        qemu_printf("Max guest advance   %"PRIi64" ms\n",
+                    max_advance / SCALE_MS);
+    } else {
+        qemu_printf("Max guest delay     NA\n");
+        qemu_printf("Max guest advance   NA\n");
+    }
+}
+
+#endif /* !CONFIG_USER_ONLY */
diff --git a/accel/tcg/tcg-all.c b/accel/tcg/tcg-all.c
index 3b4fda5640..e27385d051 100644
--- a/accel/tcg/tcg-all.c
+++ b/accel/tcg/tcg-all.c
@@ -29,6 +29,7 @@
 #include "qom/object.h"
 #include "cpu.h"
 #include "sysemu/cpus.h"
+#include "sysemu/cpu-timers.h"
 #include "qemu/main-loop.h"
 #include "tcg/tcg.h"
 #include "qapi/error.h"
@@ -65,7 +66,7 @@ static void tcg_handle_interrupt(CPUState *cpu, int mask)
         qemu_cpu_kick(cpu);
     } else {
         atomic_set(&cpu_neg(cpu)->icount_decr.u16.high, -1);
-        if (use_icount &&
+        if (icount_enabled() &&
             !cpu->can_do_io
             && (mask & ~old_mask) !=3D 0) {
             cpu_abort(cpu, "Raised interrupt while not in I/O function");
@@ -104,7 +105,7 @@ static bool check_tcg_memory_orders_compatible(void)
=20
 static bool default_mttcg_enabled(void)
 {
-    if (use_icount || TCG_OVERSIZED_GUEST) {
+    if (icount_enabled() || TCG_OVERSIZED_GUEST) {
         return false;
     } else {
 #ifdef TARGET_SUPPORTS_MTTCG
@@ -146,7 +147,7 @@ static void tcg_set_thread(Object *obj, const char *val=
ue, Error **errp)
     if (strcmp(value, "multi") =3D=3D 0) {
         if (TCG_OVERSIZED_GUEST) {
             error_setg(errp, "No MTTCG when guest word size > hosts");
-        } else if (use_icount) {
+        } else if (icount_enabled()) {
             error_setg(errp, "No MTTCG when icount is enabled");
         } else {
 #ifndef TARGET_SUPPORTS_MTTCG
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 42ce1dfcff..479edeb2ea 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -57,6 +57,7 @@
 #include "qemu/main-loop.h"
 #include "exec/log.h"
 #include "sysemu/cpus.h"
+#include "sysemu/cpu-timers.h"
 #include "sysemu/tcg.h"
=20
 /* #define DEBUG_TB_INVALIDATE */
@@ -369,7 +370,7 @@ static int cpu_restore_state_from_tb(CPUState *cpu, Tra=
nslationBlock *tb,
=20
  found:
     if (reset_icount && (tb_cflags(tb) & CF_USE_ICOUNT)) {
-        assert(use_icount);
+        assert(icount_enabled());
         /* Reset the cycle counter to the start of the block
            and shift if to the number of actually executed instructions */
         cpu_neg(cpu)->icount_decr.u16.low +=3D num_insns - i;
diff --git a/cpu-timers.c b/cpu-timers.c
new file mode 100644
index 0000000000..20fea07625
--- /dev/null
+++ b/cpu-timers.c
@@ -0,0 +1,776 @@
+/*
+ * QEMU System Emulator
+ *
+ * Copyright (c) 2003-2008 Fabrice Bellard
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a=
 copy
+ * of this software and associated documentation files (the "Software"), t=
o deal
+ * in the Software without restriction, including without limitation the r=
ights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or se=
ll
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included=
 in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS=
 OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OT=
HER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING=
 FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS =
IN
+ * THE SOFTWARE.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu-common.h"
+#include "qemu/cutils.h"
+#include "migration/vmstate.h"
+#include "qapi/error.h"
+#include "qemu/error-report.h"
+#include "exec/exec-all.h"
+#include "sysemu/cpus.h"
+#include "sysemu/qtest.h"
+#include "qemu/main-loop.h"
+#include "qemu/option.h"
+#include "qemu/seqlock.h"
+#include "sysemu/replay.h"
+#include "sysemu/runstate.h"
+#include "hw/core/cpu.h"
+#include "sysemu/cpu-timers.h"
+#include "sysemu/cpu-throttle.h"
+
+typedef struct TimersState {
+    /* Protected by BQL.  */
+    int64_t cpu_ticks_prev;
+    int64_t cpu_ticks_offset;
+
+    /*
+     * Protect fields that can be respectively read outside the
+     * BQL, and written from multiple threads.
+     */
+    QemuSeqLock vm_clock_seqlock;
+    QemuSpin vm_clock_lock;
+
+    int16_t cpu_ticks_enabled;
+
+    /* Conversion factor from emulated instructions to virtual clock ticks=
.  */
+    int16_t icount_time_shift;
+
+    /* Compensate for varying guest execution speed.  */
+    int64_t qemu_icount_bias;
+
+    int64_t vm_clock_warp_start;
+    int64_t cpu_clock_offset;
+
+    /* Only written by TCG thread */
+    int64_t qemu_icount;
+
+    /* for adjusting icount */
+    QEMUTimer *icount_rt_timer;
+    QEMUTimer *icount_vm_timer;
+    QEMUTimer *icount_warp_timer;
+} TimersState;
+
+static TimersState timers_state;
+
+/*
+ * ICOUNT: Instruction Counter
+ */
+static bool icount_sleep =3D true;
+/* Arbitrarily pick 1MIPS as the minimum allowable speed.  */
+#define MAX_ICOUNT_SHIFT 10
+
+/*
+ * 0 =3D Do not count executed instructions.
+ * 1 =3D Fixed conversion of insn to ns via "shift" option
+ * 2 =3D Runtime adaptive algorithm to compute shift
+ */
+static int use_icount;
+
+int icount_enabled(void)
+{
+    return use_icount;
+}
+
+static void icount_enable_precise(void)
+{
+    use_icount =3D 1;
+}
+
+static void icount_enable_adaptive(void)
+{
+    use_icount =3D 2;
+}
+
+/*
+ * The current number of executed instructions is based on what we
+ * originally budgeted minus the current state of the decrementing
+ * icount counters in extra/u16.low.
+ */
+static int64_t icount_get_executed(CPUState *cpu)
+{
+    return (cpu->icount_budget -
+            (cpu_neg(cpu)->icount_decr.u16.low + cpu->icount_extra));
+}
+
+/*
+ * Update the global shared timer_state.qemu_icount to take into
+ * account executed instructions. This is done by the TCG vCPU
+ * thread so the main-loop can see time has moved forward.
+ */
+static void icount_update_locked(CPUState *cpu)
+{
+    int64_t executed =3D icount_get_executed(cpu);
+    cpu->icount_budget -=3D executed;
+
+    atomic_set_i64(&timers_state.qemu_icount,
+                   timers_state.qemu_icount + executed);
+}
+
+/*
+ * Update the global shared timer_state.qemu_icount to take into
+ * account executed instructions. This is done by the TCG vCPU
+ * thread so the main-loop can see time has moved forward.
+ */
+void icount_update(CPUState *cpu)
+{
+    seqlock_write_lock(&timers_state.vm_clock_seqlock,
+                       &timers_state.vm_clock_lock);
+    icount_update_locked(cpu);
+    seqlock_write_unlock(&timers_state.vm_clock_seqlock,
+                         &timers_state.vm_clock_lock);
+}
+
+static int64_t icount_get_raw_locked(void)
+{
+    CPUState *cpu =3D current_cpu;
+
+    if (cpu && cpu->running) {
+        if (!cpu->can_do_io) {
+            error_report("Bad icount read");
+            exit(1);
+        }
+        /* Take into account what has run */
+        icount_update_locked(cpu);
+    }
+    /* The read is protected by the seqlock, but needs atomic64 to avoid U=
B */
+    return atomic_read_i64(&timers_state.qemu_icount);
+}
+
+static int64_t icount_get_locked(void)
+{
+    int64_t icount =3D icount_get_raw_locked();
+    return atomic_read_i64(&timers_state.qemu_icount_bias) +
+        icount_to_ns(icount);
+}
+
+int64_t icount_get_raw(void)
+{
+    int64_t icount;
+    unsigned start;
+
+    do {
+        start =3D seqlock_read_begin(&timers_state.vm_clock_seqlock);
+        icount =3D icount_get_raw_locked();
+    } while (seqlock_read_retry(&timers_state.vm_clock_seqlock, start));
+
+    return icount;
+}
+
+/* Return the virtual CPU time, based on the instruction counter.  */
+int64_t icount_get(void)
+{
+    int64_t icount;
+    unsigned start;
+
+    do {
+        start =3D seqlock_read_begin(&timers_state.vm_clock_seqlock);
+        icount =3D icount_get_locked();
+    } while (seqlock_read_retry(&timers_state.vm_clock_seqlock, start));
+
+    return icount;
+}
+
+int64_t icount_to_ns(int64_t icount)
+{
+    return icount << atomic_read(&timers_state.icount_time_shift);
+}
+
+/*
+ * Correlation between real and virtual time is always going to be
+ * fairly approximate, so ignore small variation.
+ * When the guest is idle real and virtual time will be aligned in
+ * the IO wait loop.
+ */
+#define ICOUNT_WOBBLE (NANOSECONDS_PER_SECOND / 10)
+
+static int64_t cpu_get_clock_locked(void);
+
+static void icount_adjust(void)
+{
+    int64_t cur_time;
+    int64_t cur_icount;
+    int64_t delta;
+
+    /* Protected by TimersState mutex.  */
+    static int64_t last_delta;
+
+    /* If the VM is not running, then do nothing.  */
+    if (!runstate_is_running()) {
+        return;
+    }
+
+    seqlock_write_lock(&timers_state.vm_clock_seqlock,
+                       &timers_state.vm_clock_lock);
+    cur_time =3D cpu_get_clock_locked();
+    cur_icount =3D icount_get_locked();
+
+    delta =3D cur_icount - cur_time;
+    /* FIXME: This is a very crude algorithm, somewhat prone to oscillatio=
n.  */
+    if (delta > 0
+        && last_delta + ICOUNT_WOBBLE < delta * 2
+        && timers_state.icount_time_shift > 0) {
+        /* The guest is getting too far ahead.  Slow time down.  */
+        atomic_set(&timers_state.icount_time_shift,
+                   timers_state.icount_time_shift - 1);
+    }
+    if (delta < 0
+        && last_delta - ICOUNT_WOBBLE > delta * 2
+        && timers_state.icount_time_shift < MAX_ICOUNT_SHIFT) {
+        /* The guest is getting too far behind.  Speed time up.  */
+        atomic_set(&timers_state.icount_time_shift,
+                   timers_state.icount_time_shift + 1);
+    }
+    last_delta =3D delta;
+    atomic_set_i64(&timers_state.qemu_icount_bias,
+                   cur_icount - (timers_state.qemu_icount
+                                 << timers_state.icount_time_shift));
+    seqlock_write_unlock(&timers_state.vm_clock_seqlock,
+                         &timers_state.vm_clock_lock);
+}
+
+static void icount_adjust_rt(void *opaque)
+{
+    timer_mod(timers_state.icount_rt_timer,
+              qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_RT) + 1000);
+    icount_adjust();
+}
+
+static void icount_adjust_vm(void *opaque)
+{
+    timer_mod(timers_state.icount_vm_timer,
+                   qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) +
+                   NANOSECONDS_PER_SECOND / 10);
+    icount_adjust();
+}
+
+int64_t icount_round(int64_t count)
+{
+    int shift =3D atomic_read(&timers_state.icount_time_shift);
+    return (count + (1 << shift) - 1) >> shift;
+}
+
+static void icount_warp_rt(void)
+{
+    unsigned seq;
+    int64_t warp_start;
+
+    /*
+     * The icount_warp_timer is rescheduled soon after vm_clock_warp_start
+     * changes from -1 to another value, so the race here is okay.
+     */
+    do {
+        seq =3D seqlock_read_begin(&timers_state.vm_clock_seqlock);
+        warp_start =3D timers_state.vm_clock_warp_start;
+    } while (seqlock_read_retry(&timers_state.vm_clock_seqlock, seq));
+
+    if (warp_start =3D=3D -1) {
+        return;
+    }
+
+    seqlock_write_lock(&timers_state.vm_clock_seqlock,
+                       &timers_state.vm_clock_lock);
+    if (runstate_is_running()) {
+        int64_t clock =3D REPLAY_CLOCK_LOCKED(REPLAY_CLOCK_VIRTUAL_RT,
+                                            cpu_get_clock_locked());
+        int64_t warp_delta;
+
+        warp_delta =3D clock - timers_state.vm_clock_warp_start;
+        if (icount_enabled() =3D=3D 2) {
+            /*
+             * In adaptive mode, do not let QEMU_CLOCK_VIRTUAL run too
+             * far ahead of real time.
+             */
+            int64_t cur_icount =3D icount_get_locked();
+            int64_t delta =3D clock - cur_icount;
+            warp_delta =3D MIN(warp_delta, delta);
+        }
+        atomic_set_i64(&timers_state.qemu_icount_bias,
+                       timers_state.qemu_icount_bias + warp_delta);
+    }
+    timers_state.vm_clock_warp_start =3D -1;
+    seqlock_write_unlock(&timers_state.vm_clock_seqlock,
+                       &timers_state.vm_clock_lock);
+
+    if (qemu_clock_expired(QEMU_CLOCK_VIRTUAL)) {
+        qemu_clock_notify(QEMU_CLOCK_VIRTUAL);
+    }
+}
+
+static void icount_timer_cb(void *opaque)
+{
+    /*
+     * No need for a checkpoint because the timer already synchronizes
+     * with CHECKPOINT_CLOCK_VIRTUAL_RT.
+     */
+    icount_warp_rt();
+}
+
+void icount_start_warp_timer(void)
+{
+    int64_t clock;
+    int64_t deadline;
+
+    if (!icount_enabled()) {
+        return;
+    }
+
+    /*
+     * Nothing to do if the VM is stopped: QEMU_CLOCK_VIRTUAL timers
+     * do not fire, so computing the deadline does not make sense.
+     */
+    if (!runstate_is_running()) {
+        return;
+    }
+
+    if (replay_mode !=3D REPLAY_MODE_PLAY) {
+        if (!all_cpu_threads_idle()) {
+            return;
+        }
+
+        if (qtest_enabled()) {
+            /* When testing, qtest commands advance icount.  */
+            return;
+        }
+
+        replay_checkpoint(CHECKPOINT_CLOCK_WARP_START);
+    } else {
+        /* warp clock deterministically in record/replay mode */
+        if (!replay_checkpoint(CHECKPOINT_CLOCK_WARP_START)) {
+            /*
+             * vCPU is sleeping and warp can't be started.
+             * It is probably a race condition: notification sent
+             * to vCPU was processed in advance and vCPU went to sleep.
+             * Therefore we have to wake it up for doing someting.
+             */
+            if (replay_has_checkpoint()) {
+                qemu_clock_notify(QEMU_CLOCK_VIRTUAL);
+            }
+            return;
+        }
+    }
+
+    /* We want to use the earliest deadline from ALL vm_clocks */
+    clock =3D qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL_RT);
+    deadline =3D qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL,
+                                          ~QEMU_TIMER_ATTR_EXTERNAL);
+    if (deadline < 0) {
+        static bool notified;
+        if (!icount_sleep && !notified) {
+            warn_report("icount sleep disabled and no active timers");
+            notified =3D true;
+        }
+        return;
+    }
+
+    if (deadline > 0) {
+        /*
+         * Ensure QEMU_CLOCK_VIRTUAL proceeds even when the virtual CPU go=
es to
+         * sleep.  Otherwise, the CPU might be waiting for a future timer
+         * interrupt to wake it up, but the interrupt never comes because
+         * the vCPU isn't running any insns and thus doesn't advance the
+         * QEMU_CLOCK_VIRTUAL.
+         */
+        if (!icount_sleep) {
+            /*
+             * We never let VCPUs sleep in no sleep icount mode.
+             * If there is a pending QEMU_CLOCK_VIRTUAL timer we just adva=
nce
+             * to the next QEMU_CLOCK_VIRTUAL event and notify it.
+             * It is useful when we want a deterministic execution time,
+             * isolated from host latencies.
+             */
+            seqlock_write_lock(&timers_state.vm_clock_seqlock,
+                               &timers_state.vm_clock_lock);
+            atomic_set_i64(&timers_state.qemu_icount_bias,
+                           timers_state.qemu_icount_bias + deadline);
+            seqlock_write_unlock(&timers_state.vm_clock_seqlock,
+                                 &timers_state.vm_clock_lock);
+            qemu_clock_notify(QEMU_CLOCK_VIRTUAL);
+        } else {
+            /*
+             * We do stop VCPUs and only advance QEMU_CLOCK_VIRTUAL after =
some
+             * "real" time, (related to the time left until the next event=
) has
+             * passed. The QEMU_CLOCK_VIRTUAL_RT clock will do this.
+             * This avoids that the warps are visible externally; for exam=
ple,
+             * you will not be sending network packets continuously instea=
d of
+             * every 100ms.
+             */
+            seqlock_write_lock(&timers_state.vm_clock_seqlock,
+                               &timers_state.vm_clock_lock);
+            if (timers_state.vm_clock_warp_start =3D=3D -1
+                || timers_state.vm_clock_warp_start > clock) {
+                timers_state.vm_clock_warp_start =3D clock;
+            }
+            seqlock_write_unlock(&timers_state.vm_clock_seqlock,
+                                 &timers_state.vm_clock_lock);
+            timer_mod_anticipate(timers_state.icount_warp_timer,
+                                 clock + deadline);
+        }
+    } else if (deadline =3D=3D 0) {
+        qemu_clock_notify(QEMU_CLOCK_VIRTUAL);
+    }
+}
+
+void icount_account_warp_timer(void)
+{
+    if (!use_icount || !icount_sleep) {
+        return;
+    }
+
+    /*
+     * Nothing to do if the VM is stopped: QEMU_CLOCK_VIRTUAL timers
+     * do not fire, so computing the deadline does not make sense.
+     */
+    if (!runstate_is_running()) {
+        return;
+    }
+
+    /* warp clock deterministically in record/replay mode */
+    if (!replay_checkpoint(CHECKPOINT_CLOCK_WARP_ACCOUNT)) {
+        return;
+    }
+
+    timer_del(timers_state.icount_warp_timer);
+    icount_warp_rt();
+}
+
+void icount_configure(QemuOpts *opts, Error **errp)
+{
+    const char *option =3D qemu_opt_get(opts, "shift");
+    bool sleep =3D qemu_opt_get_bool(opts, "sleep", true);
+    bool align =3D qemu_opt_get_bool(opts, "align", false);
+    long time_shift =3D -1;
+
+    if (!option && qemu_opt_get(opts, "align")) {
+        error_setg(errp, "Please specify shift option when using align");
+        return;
+    }
+
+    if (align && !sleep) {
+        error_setg(errp, "align=3Don and sleep=3Doff are incompatible");
+        return;
+    }
+
+    if (strcmp(option, "auto") !=3D 0) {
+        if (qemu_strtol(option, NULL, 0, &time_shift) < 0
+            || time_shift < 0 || time_shift > MAX_ICOUNT_SHIFT) {
+            error_setg(errp, "icount: Invalid shift value");
+            return;
+        }
+    } else if (icount_align_option) {
+        error_setg(errp, "shift=3Dauto and align=3Don are incompatible");
+        return;
+    } else if (!icount_sleep) {
+        error_setg(errp, "shift=3Dauto and sleep=3Doff are incompatible");
+        return;
+    }
+
+    icount_sleep =3D sleep;
+    if (icount_sleep) {
+        timers_state.icount_warp_timer =3D timer_new_ns(QEMU_CLOCK_VIRTUAL=
_RT,
+                                         icount_timer_cb, NULL);
+    }
+
+    icount_align_option =3D align;
+
+    if (time_shift >=3D 0) {
+        timers_state.icount_time_shift =3D time_shift;
+        icount_enable_precise();
+        return;
+    }
+
+    icount_enable_adaptive();
+
+    /*
+     * 125MIPS seems a reasonable initial guess at the guest speed.
+     * It will be corrected fairly quickly anyway.
+     */
+    timers_state.icount_time_shift =3D 3;
+
+    /*
+     * Have both realtime and virtual time triggers for speed adjustment.
+     * The realtime trigger catches emulated time passing too slowly,
+     * the virtual time trigger catches emulated time passing too fast.
+     * Realtime triggers occur even when idle, so use them less frequently
+     * than VM triggers.
+     */
+    timers_state.vm_clock_warp_start =3D -1;
+    timers_state.icount_rt_timer =3D timer_new_ms(QEMU_CLOCK_VIRTUAL_RT,
+                                   icount_adjust_rt, NULL);
+    timer_mod(timers_state.icount_rt_timer,
+                   qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_RT) + 1000);
+    timers_state.icount_vm_timer =3D timer_new_ns(QEMU_CLOCK_VIRTUAL,
+                                        icount_adjust_vm, NULL);
+    timer_mod(timers_state.icount_vm_timer,
+                   qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) +
+                   NANOSECONDS_PER_SECOND / 10);
+}
+
+/* clock and ticks */
+
+static int64_t cpu_get_ticks_locked(void)
+{
+    int64_t ticks =3D timers_state.cpu_ticks_offset;
+    if (timers_state.cpu_ticks_enabled) {
+        ticks +=3D cpu_get_host_ticks();
+    }
+
+    if (timers_state.cpu_ticks_prev > ticks) {
+        /* Non increasing ticks may happen if the host uses software suspe=
nd. */
+        timers_state.cpu_ticks_offset +=3D timers_state.cpu_ticks_prev - t=
icks;
+        ticks =3D timers_state.cpu_ticks_prev;
+    }
+
+    timers_state.cpu_ticks_prev =3D ticks;
+    return ticks;
+}
+
+/*
+ * return the time elapsed in VM between vm_start and vm_stop.  Unless
+ * icount is active, cpu_get_ticks() uses units of the host CPU cycle
+ * counter.
+ */
+int64_t cpu_get_ticks(void)
+{
+    int64_t ticks;
+
+    if (icount_enabled()) {
+        return icount_get();
+    }
+
+    qemu_spin_lock(&timers_state.vm_clock_lock);
+    ticks =3D cpu_get_ticks_locked();
+    qemu_spin_unlock(&timers_state.vm_clock_lock);
+    return ticks;
+}
+
+static int64_t cpu_get_clock_locked(void)
+{
+    int64_t time;
+
+    time =3D timers_state.cpu_clock_offset;
+    if (timers_state.cpu_ticks_enabled) {
+        time +=3D get_clock();
+    }
+
+    return time;
+}
+
+/*
+ * Return the monotonic time elapsed in VM, i.e.,
+ * the time between vm_start and vm_stop
+ */
+int64_t cpu_get_clock(void)
+{
+    int64_t ti;
+    unsigned start;
+
+    do {
+        start =3D seqlock_read_begin(&timers_state.vm_clock_seqlock);
+        ti =3D cpu_get_clock_locked();
+    } while (seqlock_read_retry(&timers_state.vm_clock_seqlock, start));
+
+    return ti;
+}
+
+/*
+ * enable cpu_get_ticks()
+ * Caller must hold BQL which serves as mutex for vm_clock_seqlock.
+ */
+void cpu_enable_ticks(void)
+{
+    seqlock_write_lock(&timers_state.vm_clock_seqlock,
+                       &timers_state.vm_clock_lock);
+    if (!timers_state.cpu_ticks_enabled) {
+        timers_state.cpu_ticks_offset -=3D cpu_get_host_ticks();
+        timers_state.cpu_clock_offset -=3D get_clock();
+        timers_state.cpu_ticks_enabled =3D 1;
+    }
+    seqlock_write_unlock(&timers_state.vm_clock_seqlock,
+                       &timers_state.vm_clock_lock);
+}
+
+/*
+ * disable cpu_get_ticks() : the clock is stopped. You must not call
+ * cpu_get_ticks() after that.
+ * Caller must hold BQL which serves as mutex for vm_clock_seqlock.
+ */
+void cpu_disable_ticks(void)
+{
+    seqlock_write_lock(&timers_state.vm_clock_seqlock,
+                       &timers_state.vm_clock_lock);
+    if (timers_state.cpu_ticks_enabled) {
+        timers_state.cpu_ticks_offset +=3D cpu_get_host_ticks();
+        timers_state.cpu_clock_offset =3D cpu_get_clock_locked();
+        timers_state.cpu_ticks_enabled =3D 0;
+    }
+    seqlock_write_unlock(&timers_state.vm_clock_seqlock,
+                         &timers_state.vm_clock_lock);
+}
+
+void qtest_clock_warp(int64_t dest)
+{
+    int64_t clock =3D qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
+    AioContext *aio_context;
+    assert(qtest_enabled());
+    aio_context =3D qemu_get_aio_context();
+    while (clock < dest) {
+        int64_t deadline =3D qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL,
+                                                      QEMU_TIMER_ATTR_ALL);
+        int64_t warp =3D qemu_soonest_timeout(dest - clock, deadline);
+
+        seqlock_write_lock(&timers_state.vm_clock_seqlock,
+                           &timers_state.vm_clock_lock);
+        atomic_set_i64(&timers_state.qemu_icount_bias,
+                       timers_state.qemu_icount_bias + warp);
+        seqlock_write_unlock(&timers_state.vm_clock_seqlock,
+                             &timers_state.vm_clock_lock);
+
+        qemu_clock_run_timers(QEMU_CLOCK_VIRTUAL);
+        timerlist_run_timers(aio_context->tlg.tl[QEMU_CLOCK_VIRTUAL]);
+        clock =3D qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
+    }
+    qemu_clock_notify(QEMU_CLOCK_VIRTUAL);
+}
+
+static bool icount_state_needed(void *opaque)
+{
+    return icount_enabled();
+}
+
+static bool warp_timer_state_needed(void *opaque)
+{
+    TimersState *s =3D opaque;
+    return s->icount_warp_timer !=3D NULL;
+}
+
+static bool adjust_timers_state_needed(void *opaque)
+{
+    TimersState *s =3D opaque;
+    return s->icount_rt_timer !=3D NULL;
+}
+
+/*
+ * Subsection for warp timer migration is optional, because may not be cre=
ated
+ */
+static const VMStateDescription icount_vmstate_warp_timer =3D {
+    .name =3D "timer/icount/warp_timer",
+    .version_id =3D 1,
+    .minimum_version_id =3D 1,
+    .needed =3D warp_timer_state_needed,
+    .fields =3D (VMStateField[]) {
+        VMSTATE_INT64(vm_clock_warp_start, TimersState),
+        VMSTATE_TIMER_PTR(icount_warp_timer, TimersState),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
+static const VMStateDescription icount_vmstate_adjust_timers =3D {
+    .name =3D "timer/icount/timers",
+    .version_id =3D 1,
+    .minimum_version_id =3D 1,
+    .needed =3D adjust_timers_state_needed,
+    .fields =3D (VMStateField[]) {
+        VMSTATE_TIMER_PTR(icount_rt_timer, TimersState),
+        VMSTATE_TIMER_PTR(icount_vm_timer, TimersState),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
+/*
+ * This is a subsection for icount migration.
+ */
+static const VMStateDescription icount_vmstate_timers =3D {
+    .name =3D "timer/icount",
+    .version_id =3D 1,
+    .minimum_version_id =3D 1,
+    .needed =3D icount_state_needed,
+    .fields =3D (VMStateField[]) {
+        VMSTATE_INT64(qemu_icount_bias, TimersState),
+        VMSTATE_INT64(qemu_icount, TimersState),
+        VMSTATE_END_OF_LIST()
+    },
+    .subsections =3D (const VMStateDescription * []) {
+        &icount_vmstate_warp_timer,
+        &icount_vmstate_adjust_timers,
+        NULL
+    }
+};
+
+static const VMStateDescription vmstate_timers =3D {
+    .name =3D "timer",
+    .version_id =3D 2,
+    .minimum_version_id =3D 1,
+    .fields =3D (VMStateField[]) {
+        VMSTATE_INT64(cpu_ticks_offset, TimersState),
+        VMSTATE_UNUSED(8),
+        VMSTATE_INT64_V(cpu_clock_offset, TimersState, 2),
+        VMSTATE_END_OF_LIST()
+    },
+    .subsections =3D (const VMStateDescription * []) {
+        &icount_vmstate_timers,
+        NULL
+    }
+};
+
+static void do_nothing(CPUState *cpu, run_on_cpu_data unused)
+{
+}
+
+void qemu_timer_notify_cb(void *opaque, QEMUClockType type)
+{
+    if (!icount_enabled() || type !=3D QEMU_CLOCK_VIRTUAL) {
+        qemu_notify_event();
+        return;
+    }
+
+    if (qemu_in_vcpu_thread()) {
+        /*
+         * A CPU is currently running; kick it back out to the
+         * tcg_cpu_exec() loop so it will recalculate its
+         * icount deadline immediately.
+         */
+        qemu_cpu_kick(current_cpu);
+    } else if (first_cpu) {
+        /*
+         * qemu_cpu_kick is not enough to kick a halted CPU out of
+         * qemu_tcg_wait_io_event.  async_run_on_cpu, instead,
+         * causes cpu_thread_is_idle to return false.  This way,
+         * handle_icount_deadline can run.
+         * If we have no CPUs at all for some reason, we don't
+         * need to do anything.
+         */
+        async_run_on_cpu(first_cpu, do_nothing, RUN_ON_CPU_NULL);
+    }
+}
+
+/* initialize this module and the cpu throttle for convenience as well */
+void cpu_timers_init(void)
+{
+    seqlock_init(&timers_state.vm_clock_seqlock);
+    qemu_spin_init(&timers_state.vm_clock_lock);
+    vmstate_register(NULL, 0, &vmstate_timers, &timers_state);
+
+    cpu_throttle_init();
+}
diff --git a/cpus.c b/cpus.c
index 3a46a4fc2b..7e9f545be8 100644
--- a/cpus.c
+++ b/cpus.c
@@ -58,11 +58,10 @@
 #include "hw/nmi.h"
 #include "sysemu/replay.h"
 #include "sysemu/runstate.h"
+#include "sysemu/cpu-timers.h"
 #include "hw/boards.h"
 #include "hw/hw.h"
=20
-#include "sysemu/cpu-throttle.h"
-
 #ifdef CONFIG_LINUX
=20
 #include <sys/prctl.h>
@@ -83,9 +82,6 @@
=20
 static QemuMutex qemu_global_mutex;
=20
-int64_t max_delay;
-int64_t max_advance;
-
 bool cpu_is_stopped(CPUState *cpu)
 {
     return cpu->stopped || !runstate_is_running();
@@ -106,7 +102,7 @@ static bool cpu_thread_is_idle(CPUState *cpu)
     return true;
 }
=20
-static bool all_cpu_threads_idle(void)
+bool all_cpu_threads_idle(void)
 {
     CPUState *cpu;
=20
@@ -118,668 +114,8 @@ static bool all_cpu_threads_idle(void)
     return true;
 }
=20
-/***********************************************************/
-/* guest cycle counter */
-
-/* Protected by TimersState seqlock */
-
-static bool icount_sleep =3D true;
-/* Arbitrarily pick 1MIPS as the minimum allowable speed.  */
-#define MAX_ICOUNT_SHIFT 10
-
-typedef struct TimersState {
-    /* Protected by BQL.  */
-    int64_t cpu_ticks_prev;
-    int64_t cpu_ticks_offset;
-
-    /* Protect fields that can be respectively read outside the
-     * BQL, and written from multiple threads.
-     */
-    QemuSeqLock vm_clock_seqlock;
-    QemuSpin vm_clock_lock;
-
-    int16_t cpu_ticks_enabled;
-
-    /* Conversion factor from emulated instructions to virtual clock ticks=
.  */
-    int16_t icount_time_shift;
-
-    /* Compensate for varying guest execution speed.  */
-    int64_t qemu_icount_bias;
-
-    int64_t vm_clock_warp_start;
-    int64_t cpu_clock_offset;
-
-    /* Only written by TCG thread */
-    int64_t qemu_icount;
-
-    /* for adjusting icount */
-    QEMUTimer *icount_rt_timer;
-    QEMUTimer *icount_vm_timer;
-    QEMUTimer *icount_warp_timer;
-} TimersState;
-
-static TimersState timers_state;
 bool mttcg_enabled;
=20
-
-/* The current number of executed instructions is based on what we
- * originally budgeted minus the current state of the decrementing
- * icount counters in extra/u16.low.
- */
-static int64_t cpu_get_icount_executed(CPUState *cpu)
-{
-    return (cpu->icount_budget -
-            (cpu_neg(cpu)->icount_decr.u16.low + cpu->icount_extra));
-}
-
-/*
- * Update the global shared timer_state.qemu_icount to take into
- * account executed instructions. This is done by the TCG vCPU
- * thread so the main-loop can see time has moved forward.
- */
-static void cpu_update_icount_locked(CPUState *cpu)
-{
-    int64_t executed =3D cpu_get_icount_executed(cpu);
-    cpu->icount_budget -=3D executed;
-
-    atomic_set_i64(&timers_state.qemu_icount,
-                   timers_state.qemu_icount + executed);
-}
-
-/*
- * Update the global shared timer_state.qemu_icount to take into
- * account executed instructions. This is done by the TCG vCPU
- * thread so the main-loop can see time has moved forward.
- */
-void cpu_update_icount(CPUState *cpu)
-{
-    seqlock_write_lock(&timers_state.vm_clock_seqlock,
-                       &timers_state.vm_clock_lock);
-    cpu_update_icount_locked(cpu);
-    seqlock_write_unlock(&timers_state.vm_clock_seqlock,
-                         &timers_state.vm_clock_lock);
-}
-
-static int64_t cpu_get_icount_raw_locked(void)
-{
-    CPUState *cpu =3D current_cpu;
-
-    if (cpu && cpu->running) {
-        if (!cpu->can_do_io) {
-            error_report("Bad icount read");
-            exit(1);
-        }
-        /* Take into account what has run */
-        cpu_update_icount_locked(cpu);
-    }
-    /* The read is protected by the seqlock, but needs atomic64 to avoid U=
B */
-    return atomic_read_i64(&timers_state.qemu_icount);
-}
-
-static int64_t cpu_get_icount_locked(void)
-{
-    int64_t icount =3D cpu_get_icount_raw_locked();
-    return atomic_read_i64(&timers_state.qemu_icount_bias) +
-        cpu_icount_to_ns(icount);
-}
-
-int64_t cpu_get_icount_raw(void)
-{
-    int64_t icount;
-    unsigned start;
-
-    do {
-        start =3D seqlock_read_begin(&timers_state.vm_clock_seqlock);
-        icount =3D cpu_get_icount_raw_locked();
-    } while (seqlock_read_retry(&timers_state.vm_clock_seqlock, start));
-
-    return icount;
-}
-
-/* Return the virtual CPU time, based on the instruction counter.  */
-int64_t cpu_get_icount(void)
-{
-    int64_t icount;
-    unsigned start;
-
-    do {
-        start =3D seqlock_read_begin(&timers_state.vm_clock_seqlock);
-        icount =3D cpu_get_icount_locked();
-    } while (seqlock_read_retry(&timers_state.vm_clock_seqlock, start));
-
-    return icount;
-}
-
-int64_t cpu_icount_to_ns(int64_t icount)
-{
-    return icount << atomic_read(&timers_state.icount_time_shift);
-}
-
-static int64_t cpu_get_ticks_locked(void)
-{
-    int64_t ticks =3D timers_state.cpu_ticks_offset;
-    if (timers_state.cpu_ticks_enabled) {
-        ticks +=3D cpu_get_host_ticks();
-    }
-
-    if (timers_state.cpu_ticks_prev > ticks) {
-        /* Non increasing ticks may happen if the host uses software suspe=
nd.  */
-        timers_state.cpu_ticks_offset +=3D timers_state.cpu_ticks_prev - t=
icks;
-        ticks =3D timers_state.cpu_ticks_prev;
-    }
-
-    timers_state.cpu_ticks_prev =3D ticks;
-    return ticks;
-}
-
-/* return the time elapsed in VM between vm_start and vm_stop.  Unless
- * icount is active, cpu_get_ticks() uses units of the host CPU cycle
- * counter.
- */
-int64_t cpu_get_ticks(void)
-{
-    int64_t ticks;
-
-    if (use_icount) {
-        return cpu_get_icount();
-    }
-
-    qemu_spin_lock(&timers_state.vm_clock_lock);
-    ticks =3D cpu_get_ticks_locked();
-    qemu_spin_unlock(&timers_state.vm_clock_lock);
-    return ticks;
-}
-
-static int64_t cpu_get_clock_locked(void)
-{
-    int64_t time;
-
-    time =3D timers_state.cpu_clock_offset;
-    if (timers_state.cpu_ticks_enabled) {
-        time +=3D get_clock();
-    }
-
-    return time;
-}
-
-/* Return the monotonic time elapsed in VM, i.e.,
- * the time between vm_start and vm_stop
- */
-int64_t cpu_get_clock(void)
-{
-    int64_t ti;
-    unsigned start;
-
-    do {
-        start =3D seqlock_read_begin(&timers_state.vm_clock_seqlock);
-        ti =3D cpu_get_clock_locked();
-    } while (seqlock_read_retry(&timers_state.vm_clock_seqlock, start));
-
-    return ti;
-}
-
-/* enable cpu_get_ticks()
- * Caller must hold BQL which serves as mutex for vm_clock_seqlock.
- */
-void cpu_enable_ticks(void)
-{
-    seqlock_write_lock(&timers_state.vm_clock_seqlock,
-                       &timers_state.vm_clock_lock);
-    if (!timers_state.cpu_ticks_enabled) {
-        timers_state.cpu_ticks_offset -=3D cpu_get_host_ticks();
-        timers_state.cpu_clock_offset -=3D get_clock();
-        timers_state.cpu_ticks_enabled =3D 1;
-    }
-    seqlock_write_unlock(&timers_state.vm_clock_seqlock,
-                       &timers_state.vm_clock_lock);
-}
-
-/* disable cpu_get_ticks() : the clock is stopped. You must not call
- * cpu_get_ticks() after that.
- * Caller must hold BQL which serves as mutex for vm_clock_seqlock.
- */
-void cpu_disable_ticks(void)
-{
-    seqlock_write_lock(&timers_state.vm_clock_seqlock,
-                       &timers_state.vm_clock_lock);
-    if (timers_state.cpu_ticks_enabled) {
-        timers_state.cpu_ticks_offset +=3D cpu_get_host_ticks();
-        timers_state.cpu_clock_offset =3D cpu_get_clock_locked();
-        timers_state.cpu_ticks_enabled =3D 0;
-    }
-    seqlock_write_unlock(&timers_state.vm_clock_seqlock,
-                         &timers_state.vm_clock_lock);
-}
-
-/* Correlation between real and virtual time is always going to be
-   fairly approximate, so ignore small variation.
-   When the guest is idle real and virtual time will be aligned in
-   the IO wait loop.  */
-#define ICOUNT_WOBBLE (NANOSECONDS_PER_SECOND / 10)
-
-static void icount_adjust(void)
-{
-    int64_t cur_time;
-    int64_t cur_icount;
-    int64_t delta;
-
-    /* Protected by TimersState mutex.  */
-    static int64_t last_delta;
-
-    /* If the VM is not running, then do nothing.  */
-    if (!runstate_is_running()) {
-        return;
-    }
-
-    seqlock_write_lock(&timers_state.vm_clock_seqlock,
-                       &timers_state.vm_clock_lock);
-    cur_time =3D cpu_get_clock_locked();
-    cur_icount =3D cpu_get_icount_locked();
-
-    delta =3D cur_icount - cur_time;
-    /* FIXME: This is a very crude algorithm, somewhat prone to oscillatio=
n.  */
-    if (delta > 0
-        && last_delta + ICOUNT_WOBBLE < delta * 2
-        && timers_state.icount_time_shift > 0) {
-        /* The guest is getting too far ahead.  Slow time down.  */
-        atomic_set(&timers_state.icount_time_shift,
-                   timers_state.icount_time_shift - 1);
-    }
-    if (delta < 0
-        && last_delta - ICOUNT_WOBBLE > delta * 2
-        && timers_state.icount_time_shift < MAX_ICOUNT_SHIFT) {
-        /* The guest is getting too far behind.  Speed time up.  */
-        atomic_set(&timers_state.icount_time_shift,
-                   timers_state.icount_time_shift + 1);
-    }
-    last_delta =3D delta;
-    atomic_set_i64(&timers_state.qemu_icount_bias,
-                   cur_icount - (timers_state.qemu_icount
-                                 << timers_state.icount_time_shift));
-    seqlock_write_unlock(&timers_state.vm_clock_seqlock,
-                         &timers_state.vm_clock_lock);
-}
-
-static void icount_adjust_rt(void *opaque)
-{
-    timer_mod(timers_state.icount_rt_timer,
-              qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_RT) + 1000);
-    icount_adjust();
-}
-
-static void icount_adjust_vm(void *opaque)
-{
-    timer_mod(timers_state.icount_vm_timer,
-                   qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) +
-                   NANOSECONDS_PER_SECOND / 10);
-    icount_adjust();
-}
-
-static int64_t qemu_icount_round(int64_t count)
-{
-    int shift =3D atomic_read(&timers_state.icount_time_shift);
-    return (count + (1 << shift) - 1) >> shift;
-}
-
-static void icount_warp_rt(void)
-{
-    unsigned seq;
-    int64_t warp_start;
-
-    /* The icount_warp_timer is rescheduled soon after vm_clock_warp_start
-     * changes from -1 to another value, so the race here is okay.
-     */
-    do {
-        seq =3D seqlock_read_begin(&timers_state.vm_clock_seqlock);
-        warp_start =3D timers_state.vm_clock_warp_start;
-    } while (seqlock_read_retry(&timers_state.vm_clock_seqlock, seq));
-
-    if (warp_start =3D=3D -1) {
-        return;
-    }
-
-    seqlock_write_lock(&timers_state.vm_clock_seqlock,
-                       &timers_state.vm_clock_lock);
-    if (runstate_is_running()) {
-        int64_t clock =3D REPLAY_CLOCK_LOCKED(REPLAY_CLOCK_VIRTUAL_RT,
-                                            cpu_get_clock_locked());
-        int64_t warp_delta;
-
-        warp_delta =3D clock - timers_state.vm_clock_warp_start;
-        if (use_icount =3D=3D 2) {
-            /*
-             * In adaptive mode, do not let QEMU_CLOCK_VIRTUAL run too
-             * far ahead of real time.
-             */
-            int64_t cur_icount =3D cpu_get_icount_locked();
-            int64_t delta =3D clock - cur_icount;
-            warp_delta =3D MIN(warp_delta, delta);
-        }
-        atomic_set_i64(&timers_state.qemu_icount_bias,
-                       timers_state.qemu_icount_bias + warp_delta);
-    }
-    timers_state.vm_clock_warp_start =3D -1;
-    seqlock_write_unlock(&timers_state.vm_clock_seqlock,
-                       &timers_state.vm_clock_lock);
-
-    if (qemu_clock_expired(QEMU_CLOCK_VIRTUAL)) {
-        qemu_clock_notify(QEMU_CLOCK_VIRTUAL);
-    }
-}
-
-static void icount_timer_cb(void *opaque)
-{
-    /* No need for a checkpoint because the timer already synchronizes
-     * with CHECKPOINT_CLOCK_VIRTUAL_RT.
-     */
-    icount_warp_rt();
-}
-
-void qtest_clock_warp(int64_t dest)
-{
-    int64_t clock =3D qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
-    AioContext *aio_context;
-    assert(qtest_enabled());
-    aio_context =3D qemu_get_aio_context();
-    while (clock < dest) {
-        int64_t deadline =3D qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL,
-                                                      QEMU_TIMER_ATTR_ALL);
-        int64_t warp =3D qemu_soonest_timeout(dest - clock, deadline);
-
-        seqlock_write_lock(&timers_state.vm_clock_seqlock,
-                           &timers_state.vm_clock_lock);
-        atomic_set_i64(&timers_state.qemu_icount_bias,
-                       timers_state.qemu_icount_bias + warp);
-        seqlock_write_unlock(&timers_state.vm_clock_seqlock,
-                             &timers_state.vm_clock_lock);
-
-        qemu_clock_run_timers(QEMU_CLOCK_VIRTUAL);
-        timerlist_run_timers(aio_context->tlg.tl[QEMU_CLOCK_VIRTUAL]);
-        clock =3D qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
-    }
-    qemu_clock_notify(QEMU_CLOCK_VIRTUAL);
-}
-
-void qemu_start_warp_timer(void)
-{
-    int64_t clock;
-    int64_t deadline;
-
-    if (!use_icount) {
-        return;
-    }
-
-    /* Nothing to do if the VM is stopped: QEMU_CLOCK_VIRTUAL timers
-     * do not fire, so computing the deadline does not make sense.
-     */
-    if (!runstate_is_running()) {
-        return;
-    }
-
-    if (replay_mode !=3D REPLAY_MODE_PLAY) {
-        if (!all_cpu_threads_idle()) {
-            return;
-        }
-
-        if (qtest_enabled()) {
-            /* When testing, qtest commands advance icount.  */
-            return;
-        }
-
-        replay_checkpoint(CHECKPOINT_CLOCK_WARP_START);
-    } else {
-        /* warp clock deterministically in record/replay mode */
-        if (!replay_checkpoint(CHECKPOINT_CLOCK_WARP_START)) {
-            /* vCPU is sleeping and warp can't be started.
-               It is probably a race condition: notification sent
-               to vCPU was processed in advance and vCPU went to sleep.
-               Therefore we have to wake it up for doing someting. */
-            if (replay_has_checkpoint()) {
-                qemu_clock_notify(QEMU_CLOCK_VIRTUAL);
-            }
-            return;
-        }
-    }
-
-    /* We want to use the earliest deadline from ALL vm_clocks */
-    clock =3D qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL_RT);
-    deadline =3D qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL,
-                                          ~QEMU_TIMER_ATTR_EXTERNAL);
-    if (deadline < 0) {
-        static bool notified;
-        if (!icount_sleep && !notified) {
-            warn_report("icount sleep disabled and no active timers");
-            notified =3D true;
-        }
-        return;
-    }
-
-    if (deadline > 0) {
-        /*
-         * Ensure QEMU_CLOCK_VIRTUAL proceeds even when the virtual CPU go=
es to
-         * sleep.  Otherwise, the CPU might be waiting for a future timer
-         * interrupt to wake it up, but the interrupt never comes because
-         * the vCPU isn't running any insns and thus doesn't advance the
-         * QEMU_CLOCK_VIRTUAL.
-         */
-        if (!icount_sleep) {
-            /*
-             * We never let VCPUs sleep in no sleep icount mode.
-             * If there is a pending QEMU_CLOCK_VIRTUAL timer we just adva=
nce
-             * to the next QEMU_CLOCK_VIRTUAL event and notify it.
-             * It is useful when we want a deterministic execution time,
-             * isolated from host latencies.
-             */
-            seqlock_write_lock(&timers_state.vm_clock_seqlock,
-                               &timers_state.vm_clock_lock);
-            atomic_set_i64(&timers_state.qemu_icount_bias,
-                           timers_state.qemu_icount_bias + deadline);
-            seqlock_write_unlock(&timers_state.vm_clock_seqlock,
-                                 &timers_state.vm_clock_lock);
-            qemu_clock_notify(QEMU_CLOCK_VIRTUAL);
-        } else {
-            /*
-             * We do stop VCPUs and only advance QEMU_CLOCK_VIRTUAL after =
some
-             * "real" time, (related to the time left until the next event=
) has
-             * passed. The QEMU_CLOCK_VIRTUAL_RT clock will do this.
-             * This avoids that the warps are visible externally; for exam=
ple,
-             * you will not be sending network packets continuously instea=
d of
-             * every 100ms.
-             */
-            seqlock_write_lock(&timers_state.vm_clock_seqlock,
-                               &timers_state.vm_clock_lock);
-            if (timers_state.vm_clock_warp_start =3D=3D -1
-                || timers_state.vm_clock_warp_start > clock) {
-                timers_state.vm_clock_warp_start =3D clock;
-            }
-            seqlock_write_unlock(&timers_state.vm_clock_seqlock,
-                                 &timers_state.vm_clock_lock);
-            timer_mod_anticipate(timers_state.icount_warp_timer,
-                                 clock + deadline);
-        }
-    } else if (deadline =3D=3D 0) {
-        qemu_clock_notify(QEMU_CLOCK_VIRTUAL);
-    }
-}
-
-static void qemu_account_warp_timer(void)
-{
-    if (!use_icount || !icount_sleep) {
-        return;
-    }
-
-    /* Nothing to do if the VM is stopped: QEMU_CLOCK_VIRTUAL timers
-     * do not fire, so computing the deadline does not make sense.
-     */
-    if (!runstate_is_running()) {
-        return;
-    }
-
-    /* warp clock deterministically in record/replay mode */
-    if (!replay_checkpoint(CHECKPOINT_CLOCK_WARP_ACCOUNT)) {
-        return;
-    }
-
-    timer_del(timers_state.icount_warp_timer);
-    icount_warp_rt();
-}
-
-static bool icount_state_needed(void *opaque)
-{
-    return use_icount;
-}
-
-static bool warp_timer_state_needed(void *opaque)
-{
-    TimersState *s =3D opaque;
-    return s->icount_warp_timer !=3D NULL;
-}
-
-static bool adjust_timers_state_needed(void *opaque)
-{
-    TimersState *s =3D opaque;
-    return s->icount_rt_timer !=3D NULL;
-}
-
-/*
- * Subsection for warp timer migration is optional, because may not be cre=
ated
- */
-static const VMStateDescription icount_vmstate_warp_timer =3D {
-    .name =3D "timer/icount/warp_timer",
-    .version_id =3D 1,
-    .minimum_version_id =3D 1,
-    .needed =3D warp_timer_state_needed,
-    .fields =3D (VMStateField[]) {
-        VMSTATE_INT64(vm_clock_warp_start, TimersState),
-        VMSTATE_TIMER_PTR(icount_warp_timer, TimersState),
-        VMSTATE_END_OF_LIST()
-    }
-};
-
-static const VMStateDescription icount_vmstate_adjust_timers =3D {
-    .name =3D "timer/icount/timers",
-    .version_id =3D 1,
-    .minimum_version_id =3D 1,
-    .needed =3D adjust_timers_state_needed,
-    .fields =3D (VMStateField[]) {
-        VMSTATE_TIMER_PTR(icount_rt_timer, TimersState),
-        VMSTATE_TIMER_PTR(icount_vm_timer, TimersState),
-        VMSTATE_END_OF_LIST()
-    }
-};
-
-/*
- * This is a subsection for icount migration.
- */
-static const VMStateDescription icount_vmstate_timers =3D {
-    .name =3D "timer/icount",
-    .version_id =3D 1,
-    .minimum_version_id =3D 1,
-    .needed =3D icount_state_needed,
-    .fields =3D (VMStateField[]) {
-        VMSTATE_INT64(qemu_icount_bias, TimersState),
-        VMSTATE_INT64(qemu_icount, TimersState),
-        VMSTATE_END_OF_LIST()
-    },
-    .subsections =3D (const VMStateDescription*[]) {
-        &icount_vmstate_warp_timer,
-        &icount_vmstate_adjust_timers,
-        NULL
-    }
-};
-
-static const VMStateDescription vmstate_timers =3D {
-    .name =3D "timer",
-    .version_id =3D 2,
-    .minimum_version_id =3D 1,
-    .fields =3D (VMStateField[]) {
-        VMSTATE_INT64(cpu_ticks_offset, TimersState),
-        VMSTATE_UNUSED(8),
-        VMSTATE_INT64_V(cpu_clock_offset, TimersState, 2),
-        VMSTATE_END_OF_LIST()
-    },
-    .subsections =3D (const VMStateDescription*[]) {
-        &icount_vmstate_timers,
-        NULL
-    }
-};
-
-void cpu_ticks_init(void)
-{
-    seqlock_init(&timers_state.vm_clock_seqlock);
-    qemu_spin_init(&timers_state.vm_clock_lock);
-    vmstate_register(NULL, 0, &vmstate_timers, &timers_state);
-    cpu_throttle_init();
-}
-
-void configure_icount(QemuOpts *opts, Error **errp)
-{
-    const char *option =3D qemu_opt_get(opts, "shift");
-    bool sleep =3D qemu_opt_get_bool(opts, "sleep", true);
-    bool align =3D qemu_opt_get_bool(opts, "align", false);
-    long time_shift =3D -1;
-
-    if (!option && qemu_opt_get(opts, "align")) {
-        error_setg(errp, "Please specify shift option when using align");
-        return;
-    }
-
-    if (align && !sleep) {
-        error_setg(errp, "align=3Don and sleep=3Doff are incompatible");
-        return;
-    }
-
-    if (strcmp(option, "auto") !=3D 0) {
-        if (qemu_strtol(option, NULL, 0, &time_shift) < 0
-            || time_shift < 0 || time_shift > MAX_ICOUNT_SHIFT) {
-            error_setg(errp, "icount: Invalid shift value");
-            return;
-        }
-    } else if (icount_align_option) {
-        error_setg(errp, "shift=3Dauto and align=3Don are incompatible");
-        return;
-    } else if (!icount_sleep) {
-        error_setg(errp, "shift=3Dauto and sleep=3Doff are incompatible");
-        return;
-    }
-
-    icount_sleep =3D sleep;
-    if (icount_sleep) {
-        timers_state.icount_warp_timer =3D timer_new_ns(QEMU_CLOCK_VIRTUAL=
_RT,
-                                         icount_timer_cb, NULL);
-    }
-
-    icount_align_option =3D align;
-
-    if (time_shift >=3D 0) {
-        timers_state.icount_time_shift =3D time_shift;
-        use_icount =3D 1;
-        return;
-    }
-
-    use_icount =3D 2;
-
-    /* 125MIPS seems a reasonable initial guess at the guest speed.
-       It will be corrected fairly quickly anyway.  */
-    timers_state.icount_time_shift =3D 3;
-
-    /* Have both realtime and virtual time triggers for speed adjustment.
-       The realtime trigger catches emulated time passing too slowly,
-       the virtual time trigger catches emulated time passing too fast.
-       Realtime triggers occur even when idle, so use them less frequently
-       than VM triggers.  */
-    timers_state.vm_clock_warp_start =3D -1;
-    timers_state.icount_rt_timer =3D timer_new_ms(QEMU_CLOCK_VIRTUAL_RT,
-                                   icount_adjust_rt, NULL);
-    timer_mod(timers_state.icount_rt_timer,
-                   qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_RT) + 1000);
-    timers_state.icount_vm_timer =3D timer_new_ns(QEMU_CLOCK_VIRTUAL,
-                                        icount_adjust_vm, NULL);
-    timer_mod(timers_state.icount_vm_timer,
-                   qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) +
-                   NANOSECONDS_PER_SECOND / 10);
-}
-
 /***********************************************************/
 /* TCG vCPU kick timer
  *
@@ -824,35 +160,6 @@ static void qemu_cpu_kick_rr_cpus(void)
     };
 }
=20
-static void do_nothing(CPUState *cpu, run_on_cpu_data unused)
-{
-}
-
-void qemu_timer_notify_cb(void *opaque, QEMUClockType type)
-{
-    if (!use_icount || type !=3D QEMU_CLOCK_VIRTUAL) {
-        qemu_notify_event();
-        return;
-    }
-
-    if (qemu_in_vcpu_thread()) {
-        /* A CPU is currently running; kick it back out to the
-         * tcg_cpu_exec() loop so it will recalculate its
-         * icount deadline immediately.
-         */
-        qemu_cpu_kick(current_cpu);
-    } else if (first_cpu) {
-        /* qemu_cpu_kick is not enough to kick a halted CPU out of
-         * qemu_tcg_wait_io_event.  async_run_on_cpu, instead,
-         * causes cpu_thread_is_idle to return false.  This way,
-         * handle_icount_deadline can run.
-         * If we have no CPUs at all for some reason, we don't
-         * need to do anything.
-         */
-        async_run_on_cpu(first_cpu, do_nothing, RUN_ON_CPU_NULL);
-    }
-}
-
 static void kick_tcg_thread(void *opaque)
 {
     timer_mod(tcg_kick_vcpu_timer, qemu_tcg_next_kick());
@@ -1254,7 +561,7 @@ static int64_t tcg_get_icount_limit(void)
             deadline =3D INT32_MAX;
         }
=20
-        return qemu_icount_round(deadline);
+        return icount_round(deadline);
     } else {
         return replay_get_instructions();
     }
@@ -1263,7 +570,7 @@ static int64_t tcg_get_icount_limit(void)
 static void handle_icount_deadline(void)
 {
     assert(qemu_in_vcpu_thread());
-    if (use_icount) {
+    if (icount_enabled()) {
         int64_t deadline =3D qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL,
                                                       QEMU_TIMER_ATTR_ALL);
=20
@@ -1277,7 +584,7 @@ static void handle_icount_deadline(void)
=20
 static void prepare_icount_for_run(CPUState *cpu)
 {
-    if (use_icount) {
+    if (icount_enabled()) {
         int insns_left;
=20
         /* These should always be cleared by process_icount_data after
@@ -1298,9 +605,9 @@ static void prepare_icount_for_run(CPUState *cpu)
=20
 static void process_icount_data(CPUState *cpu)
 {
-    if (use_icount) {
+    if (icount_enabled()) {
         /* Account for executed instructions */
-        cpu_update_icount(cpu);
+        icount_update(cpu);
=20
         /* Reset the counters */
         cpu_neg(cpu)->icount_decr.u16.low =3D 0;
@@ -1401,7 +708,7 @@ static void *qemu_tcg_rr_cpu_thread_fn(void *arg)
         replay_mutex_lock();
         qemu_mutex_lock_iothread();
         /* Account partial waits to QEMU_CLOCK_VIRTUAL.  */
-        qemu_account_warp_timer();
+        icount_account_warp_timer();
=20
         /* Run the timers here.  This is much more efficient than
          * waking up the I/O thread and waiting for completion.
@@ -1459,7 +766,7 @@ static void *qemu_tcg_rr_cpu_thread_fn(void *arg)
             atomic_mb_set(&cpu->exit_request, 0);
         }
=20
-        if (use_icount && all_cpu_threads_idle()) {
+        if (icount_enabled() && all_cpu_threads_idle()) {
             /*
              * When all cpus are sleeping (e.g in WFI), to avoid a deadlock
              * in the main_loop, wake it up in order to start the warp tim=
er.
@@ -1612,7 +919,7 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
     CPUState *cpu =3D arg;
=20
     assert(tcg_enabled());
-    g_assert(!use_icount);
+    g_assert(!icount_enabled());
=20
     rcu_register_thread();
     tcg_register_thread();
@@ -2191,21 +1498,3 @@ void qmp_inject_nmi(Error **errp)
     nmi_monitor_handle(monitor_get_cpu_index(), errp);
 }
=20
-void dump_drift_info(void)
-{
-    if (!use_icount) {
-        return;
-    }
-
-    qemu_printf("Host - Guest clock  %"PRIi64" ms\n",
-                (cpu_get_clock() - cpu_get_icount())/SCALE_MS);
-    if (icount_align_option) {
-        qemu_printf("Max guest delay     %"PRIi64" ms\n",
-                    -max_delay / SCALE_MS);
-        qemu_printf("Max guest advance   %"PRIi64" ms\n",
-                    max_advance / SCALE_MS);
-    } else {
-        qemu_printf("Max guest delay     NA\n");
-        qemu_printf("Max guest advance   NA\n");
-    }
-}
diff --git a/docs/replay.txt b/docs/replay.txt
index 70c27edb36..8952e6d852 100644
--- a/docs/replay.txt
+++ b/docs/replay.txt
@@ -184,11 +184,11 @@ is then incremented (which is called "warping" the vi=
rtual clock) as
 soon as the timer fires or the CPUs need to go out of the idle state.
 Two functions are used for this purpose; because these actions change
 virtual machine state and must be deterministic, each of them creates a
-checkpoint.  qemu_start_warp_timer checks if the CPUs are idle and if so
-starts accounting real time to virtual clock.  qemu_account_warp_timer
+checkpoint.  icount_start_warp_timer checks if the CPUs are idle and if so
+starts accounting real time to virtual clock.  icount_account_warp_timer
 is called when the CPUs get an interrupt or when the warp timer fires,
 and it warps the virtual clock by the amount of real time that has passed
-since qemu_start_warp_timer.
+since icount_start_warp_timer.
=20
 Bottom halves
 -------------
diff --git a/exec.c b/exec.c
index 5162f0d12f..db9a90469b 100644
--- a/exec.c
+++ b/exec.c
@@ -104,10 +104,6 @@ uintptr_t qemu_host_page_size;
 intptr_t qemu_host_page_mask;
=20
 #if !defined(CONFIG_USER_ONLY)
-/* 0 =3D Do not count executed instructions.
-   1 =3D Precise instruction counting.
-   2 =3D Adaptive rate instruction counting.  */
-int use_icount;
=20
 typedef struct PhysPageEntry PhysPageEntry;
=20
diff --git a/hw/core/ptimer.c b/hw/core/ptimer.c
index b5a54e2536..6c9f33208a 100644
--- a/hw/core/ptimer.c
+++ b/hw/core/ptimer.c
@@ -7,11 +7,11 @@
  */
=20
 #include "qemu/osdep.h"
-#include "qemu/timer.h"
 #include "hw/ptimer.h"
 #include "migration/vmstate.h"
 #include "qemu/host-utils.h"
 #include "sysemu/replay.h"
+#include "sysemu/cpu-timers.h"
 #include "sysemu/qtest.h"
 #include "block/aio.h"
 #include "sysemu/cpus.h"
@@ -134,7 +134,7 @@ static void ptimer_reload(ptimer_state *s, int delta_ad=
just)
      * on the current generation of host machines.
      */
=20
-    if (s->enabled =3D=3D 1 && (delta * period < 10000) && !use_icount) {
+    if (s->enabled =3D=3D 1 && (delta * period < 10000) && !icount_enabled=
()) {
         period =3D 10000 / delta;
         period_frac =3D 0;
     }
@@ -217,7 +217,7 @@ uint64_t ptimer_get_count(ptimer_state *s)
             uint32_t period_frac =3D s->period_frac;
             uint64_t period =3D s->period;
=20
-            if (!oneshot && (s->delta * period < 10000) && !use_icount) {
+            if (!oneshot && (s->delta * period < 10000) && !icount_enabled=
()) {
                 period =3D 10000 / s->delta;
                 period_frac =3D 0;
             }
diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index 7a3bc7ab66..002b3cabc2 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -34,6 +34,7 @@
 #include "sysemu/numa.h"
 #include "sysemu/replay.h"
 #include "sysemu/sysemu.h"
+#include "sysemu/cpu-timers.h"
 #include "trace.h"
=20
 #include "hw/i386/x86.h"
diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index d14374bdd4..49eedd714d 100644
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
@@ -409,8 +409,12 @@ static inline bool tlb_hit(target_ulong tlb_addr, targ=
et_ulong addr)
     return tlb_hit_page(tlb_addr, addr & TARGET_PAGE_MASK);
 }
=20
+#ifdef CONFIG_TCG
+void dump_drift_info(void);
 void dump_exec_info(void);
 void dump_opcount_info(void);
+#endif /* CONFIG_TCG */
+
 #endif /* !CONFIG_USER_ONLY */
=20
 int cpu_memory_rw_debug(CPUState *cpu, target_ulong addr,
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 8792bea07a..c1f51e37af 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -25,7 +25,7 @@
 #ifdef CONFIG_TCG
 #include "exec/cpu_ldst.h"
 #endif
-#include "sysemu/cpus.h"
+#include "sysemu/cpu-timers.h"
=20
 /* allow to see translation results - the slowdown should be negligible, s=
o we leave it */
 #define DEBUG_DISAS
@@ -489,7 +489,7 @@ static inline uint32_t tb_cflags(const TranslationBlock=
 *tb)
 static inline uint32_t curr_cflags(void)
 {
     return (parallel_cpus ? CF_PARALLEL : 0)
-         | (use_icount ? CF_USE_ICOUNT : 0);
+         | (icount_enabled() ? CF_USE_ICOUNT : 0);
 }
=20
 /* TranslationBlock invalidate API */
diff --git a/include/qemu/timer.h b/include/qemu/timer.h
index 6a8b48b5a9..c54b1b2813 100644
--- a/include/qemu/timer.h
+++ b/include/qemu/timer.h
@@ -224,13 +224,6 @@ void qemu_clock_notify(QEMUClockType type);
  */
 void qemu_clock_enable(QEMUClockType type, bool enabled);
=20
-/**
- * qemu_start_warp_timer:
- *
- * Starts a timer for virtual clock update
- */
-void qemu_start_warp_timer(void);
-
 /**
  * qemu_clock_run_timers:
  * @type: clock on which to operate
@@ -791,12 +784,6 @@ static inline int64_t qemu_soonest_timeout(int64_t tim=
eout1, int64_t timeout2)
  */
 void init_clocks(QEMUTimerListNotifyCB *notify_cb);
=20
-int64_t cpu_get_ticks(void);
-/* Caller must hold BQL */
-void cpu_enable_ticks(void);
-/* Caller must hold BQL */
-void cpu_disable_ticks(void);
-
 static inline int64_t get_max_clock_jump(void)
 {
     /* This should be small enough to prevent excessive interrupts from be=
ing
@@ -850,13 +837,6 @@ static inline int64_t get_clock(void)
 }
 #endif
=20
-/* icount */
-int64_t cpu_get_icount_raw(void);
-int64_t cpu_get_icount(void);
-int64_t cpu_get_clock(void);
-int64_t cpu_icount_to_ns(int64_t icount);
-void    cpu_update_icount(CPUState *cpu);
-
 /*******************************************/
 /* host CPU ticks (if available) */
=20
diff --git a/include/sysemu/cpu-timers.h b/include/sysemu/cpu-timers.h
new file mode 100644
index 0000000000..3db579fde7
--- /dev/null
+++ b/include/sysemu/cpu-timers.h
@@ -0,0 +1,73 @@
+#ifndef SYSEMU_CPU_TIMERS_H
+#define SYSEMU_CPU_TIMERS_H
+
+#include "qemu/timer.h"
+
+/* init the whole cpu timers API, including icount, ticks, and cpu_throttl=
e */
+void cpu_timers_init(void);
+
+/* icount - Instruction Counter API */
+
+/*
+ * Return the icount enablement state:
+ *
+ * 0 =3D Disabled - Do not count executed instructions.
+ * 1 =3D Enabled - Fixed conversion of insn to ns via "shift" option
+ * 2 =3D Enabled - Runtime adaptive algorithm to compute shift
+ */
+int icount_enabled(void);
+/*
+ * Update the icount with the executed instructions. Called by
+ * cpus-tcg vCPU thread so the main-loop can see time has moved forward.
+ */
+void icount_update(CPUState *cpu);
+
+/* get raw icount value */
+int64_t icount_get_raw(void);
+
+/* return the virtual CPU time in ns, based on the instruction counter. */
+int64_t icount_get(void);
+/*
+ * convert an instruction counter value to ns, based on the icount shift.
+ * This shift is set as a fixed value with the icount "shift" option
+ * (precise mode), or it is constantly approximated and corrected at
+ * runtime in adaptive mode.
+ */
+int64_t icount_to_ns(int64_t icount);
+
+/* configure the icount options, including "shift" */
+void icount_configure(QemuOpts *opts, Error **errp);
+
+/* used by tcg vcpu thread to calc icount budget */
+int64_t icount_round(int64_t count);
+
+/* if the CPUs are idle, start accounting real time to virtual clock. */
+void icount_start_warp_timer(void);
+void icount_account_warp_timer(void);
+
+/*
+ * CPU Ticks and Clock
+ */
+
+/* Caller must hold BQL */
+void cpu_enable_ticks(void);
+/* Caller must hold BQL */
+void cpu_disable_ticks(void);
+
+/*
+ * return the time elapsed in VM between vm_start and vm_stop.  Unless
+ * icount is active, cpu_get_ticks() uses units of the host CPU cycle
+ * counter.
+ */
+int64_t cpu_get_ticks(void);
+
+/*
+ * Returns the monotonic time elapsed in VM, i.e.,
+ * the time between vm_start and vm_stop
+ */
+int64_t cpu_get_clock(void);
+
+void qemu_timer_notify_cb(void *opaque, QEMUClockType type);
+void qtest_clock_warp(int64_t dest);
+
+#endif /* SYSEMU_CPU_TIMERS_H */
diff --git a/include/sysemu/cpus.h b/include/sysemu/cpus.h
index 3c1da6a018..149de000a0 100644
--- a/include/sysemu/cpus.h
+++ b/include/sysemu/cpus.h
@@ -4,33 +4,23 @@
 #include "qemu/timer.h"
=20
 /* cpus.c */
+bool all_cpu_threads_idle(void);
 bool qemu_in_vcpu_thread(void);
 void qemu_init_cpu_loop(void);
 void resume_all_vcpus(void);
 void pause_all_vcpus(void);
 void cpu_stop_current(void);
-void cpu_ticks_init(void);
=20
-void configure_icount(QemuOpts *opts, Error **errp);
-extern int use_icount;
 extern int icount_align_option;
=20
-/* drift information for info jit command */
-extern int64_t max_delay;
-extern int64_t max_advance;
-void dump_drift_info(void);
-
 /* Unblock cpu */
 void qemu_cpu_kick_self(void);
-void qemu_timer_notify_cb(void *opaque, QEMUClockType type);
=20
 void cpu_synchronize_all_states(void);
 void cpu_synchronize_all_post_reset(void);
 void cpu_synchronize_all_post_init(void);
 void cpu_synchronize_all_pre_loadvm(void);
=20
-void qtest_clock_warp(int64_t dest);
-
 #ifndef CONFIG_USER_ONLY
 /* vl.c */
 /* *-user doesn't have configurable SMP topology */
diff --git a/include/sysemu/replay.h b/include/sysemu/replay.h
index 5471bb514d..a140d69a73 100644
--- a/include/sysemu/replay.h
+++ b/include/sysemu/replay.h
@@ -109,12 +109,12 @@ int64_t replay_read_clock(ReplayClockKind kind);
 #define REPLAY_CLOCK(clock, value)                                      \
     (replay_mode =3D=3D REPLAY_MODE_PLAY ? replay_read_clock((clock))     =
  \
         : replay_mode =3D=3D REPLAY_MODE_RECORD                           =
  \
-            ? replay_save_clock((clock), (value), cpu_get_icount_raw()) \
+            ? replay_save_clock((clock), (value), icount_get_raw()) \
         : (value))
 #define REPLAY_CLOCK_LOCKED(clock, value)                               \
     (replay_mode =3D=3D REPLAY_MODE_PLAY ? replay_read_clock((clock))     =
  \
         : replay_mode =3D=3D REPLAY_MODE_RECORD                           =
  \
-            ? replay_save_clock((clock), (value), cpu_get_icount_raw_locke=
d()) \
+            ? replay_save_clock((clock), (value), icount_get_raw_locked())=
 \
         : (value))
=20
 /* Processing data from random generators */
diff --git a/qtest.c b/qtest.c
index 5672b75c35..a1b92853c9 100644
--- a/qtest.c
+++ b/qtest.c
@@ -21,7 +21,7 @@
 #include "exec/memory.h"
 #include "hw/irq.h"
 #include "sysemu/accel.h"
-#include "sysemu/cpus.h"
+#include "sysemu/cpu-timers.h"
 #include "qemu/config-file.h"
 #include "qemu/option.h"
 #include "qemu/error-report.h"
diff --git a/replay/replay.c b/replay/replay.c
index 706c7b4f4b..9896a3b6f5 100644
--- a/replay/replay.c
+++ b/replay/replay.c
@@ -11,10 +11,10 @@
=20
 #include "qemu/osdep.h"
 #include "qapi/error.h"
+#include "sysemu/cpu-timers.h"
 #include "sysemu/replay.h"
 #include "sysemu/runstate.h"
 #include "replay-internal.h"
-#include "qemu/timer.h"
 #include "qemu/main-loop.h"
 #include "qemu/option.h"
 #include "sysemu/cpus.h"
@@ -64,7 +64,7 @@ bool replay_next_event_is(int event)
=20
 uint64_t replay_get_current_icount(void)
 {
-    return cpu_get_icount_raw();
+    return icount_get_raw();
 }
=20
 int replay_get_instructions(void)
@@ -345,7 +345,7 @@ void replay_start(void)
         error_reportf_err(replay_blockers->data, "Record/replay: ");
         exit(1);
     }
-    if (!use_icount) {
+    if (!icount_enabled()) {
         error_report("Please enable icount to use record/replay");
         exit(1);
     }
diff --git a/softmmu/vl.c b/softmmu/vl.c
index ae5451bc23..ed53cd1b62 100644
--- a/softmmu/vl.c
+++ b/softmmu/vl.c
@@ -73,6 +73,7 @@
 #include "hw/audio/soundhw.h"
 #include "audio/audio.h"
 #include "sysemu/cpus.h"
+#include "sysemu/cpu-timers.h"
 #include "migration/colo.h"
 #include "migration/postcopy-ram.h"
 #include "sysemu/kvm.h"
@@ -2675,7 +2676,7 @@ static void user_register_global_props(void)
=20
 static int do_configure_icount(void *opaque, QemuOpts *opts, Error **errp)
 {
-    configure_icount(opts, errp);
+    icount_configure(opts, errp);
     return 0;
 }
=20
@@ -2785,7 +2786,7 @@ static void configure_accelerators(const char *progna=
me)
         error_report("falling back to %s", ac->name);
     }
=20
-    if (use_icount && !(tcg_enabled() || qtest_enabled())) {
+    if (icount_enabled() && !(tcg_enabled() || qtest_enabled())) {
         error_report("-icount is not allowed with hardware virtualization"=
);
         exit(1);
     }
@@ -4233,7 +4234,8 @@ void qemu_init(int argc, char **argv, char **envp)
     /* spice needs the timers to be initialized by this point */
     qemu_spice_init();
=20
-    cpu_ticks_init();
+    /* initialize cpu timers and VCPU throttle modules */
+    cpu_timers_init();
=20
     if (default_net) {
         QemuOptsList *net =3D qemu_find_opts("net");
diff --git a/stubs/clock-warp.c b/stubs/clock-warp.c
index b53e5dd94c..304da5091c 100644
--- a/stubs/clock-warp.c
+++ b/stubs/clock-warp.c
@@ -1,7 +1,7 @@
 #include "qemu/osdep.h"
-#include "qemu/timer.h"
+#include "sysemu/cpu-timers.h"
=20
-void qemu_start_warp_timer(void)
+void icount_start_warp_timer(void)
 {
 }
=20
diff --git a/stubs/cpu-get-clock.c b/stubs/cpu-get-clock.c
index 5a92810e87..6102338743 100644
--- a/stubs/cpu-get-clock.c
+++ b/stubs/cpu-get-clock.c
@@ -1,5 +1,5 @@
 #include "qemu/osdep.h"
-#include "qemu/timer.h"
+#include "sysemu/cpu-timers.h"
=20
 int64_t cpu_get_clock(void)
 {
diff --git a/stubs/cpu-get-icount.c b/stubs/cpu-get-icount.c
index b35f844638..23f9154ef3 100644
--- a/stubs/cpu-get-icount.c
+++ b/stubs/cpu-get-icount.c
@@ -1,20 +1,22 @@
 #include "qemu/osdep.h"
-#include "qemu/timer.h"
-#include "sysemu/cpus.h"
+#include "sysemu/cpu-timers.h"
 #include "qemu/main-loop.h"
=20
-int use_icount;
-
-int64_t cpu_get_icount(void)
+int64_t icount_get(void)
 {
     abort();
 }
=20
-int64_t cpu_get_icount_raw(void)
+int64_t icount_get_raw(void)
 {
     abort();
 }
=20
+int icount_enabled(void)
+{
+    return 0;
+}
+
 void qemu_timer_notify_cb(void *opaque, QEMUClockType type)
 {
     qemu_notify_event();
diff --git a/target/alpha/translate.c b/target/alpha/translate.c
index 8870284f57..36be602179 100644
--- a/target/alpha/translate.c
+++ b/target/alpha/translate.c
@@ -20,6 +20,7 @@
 #include "qemu/osdep.h"
 #include "cpu.h"
 #include "sysemu/cpus.h"
+#include "sysemu/cpu-timers.h"
 #include "disas/disas.h"
 #include "qemu/host-utils.h"
 #include "exec/exec-all.h"
@@ -1329,7 +1330,7 @@ static DisasJumpType gen_mfpr(DisasContext *ctx, TCGv=
 va, int regno)
     case 249: /* VMTIME */
         helper =3D gen_helper_get_vmtime;
     do_helper:
-        if (use_icount) {
+        if (icount_enabled()) {
             gen_io_start();
             helper(va);
             return DISAS_PC_STALE;
diff --git a/target/arm/helper.c b/target/arm/helper.c
index a92ae55672..c9f99f7952 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -24,6 +24,7 @@
 #include "hw/irq.h"
 #include "hw/semihosting/semihost.h"
 #include "sysemu/cpus.h"
+#include "sysemu/cpu-timers.h"
 #include "sysemu/kvm.h"
 #include "sysemu/tcg.h"
 #include "qemu/range.h"
@@ -1205,17 +1206,17 @@ static int64_t cycles_ns_per(uint64_t cycles)
=20
 static bool instructions_supported(CPUARMState *env)
 {
-    return use_icount =3D=3D 1 /* Precise instruction counting */;
+    return icount_enabled() =3D=3D 1; /* Precise instruction counting */
 }
=20
 static uint64_t instructions_get_count(CPUARMState *env)
 {
-    return (uint64_t)cpu_get_icount_raw();
+    return (uint64_t)icount_get_raw();
 }
=20
 static int64_t instructions_ns_per(uint64_t icount)
 {
-    return cpu_icount_to_ns((int64_t)icount);
+    return icount_to_ns((int64_t)icount);
 }
 #endif
=20
diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index 11d184cd16..6093f73e3a 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -194,8 +194,8 @@ static int write_fcsr(CPURISCVState *env, int csrno, ta=
rget_ulong val)
 static int read_instret(CPURISCVState *env, int csrno, target_ulong *val)
 {
 #if !defined(CONFIG_USER_ONLY)
-    if (use_icount) {
-        *val =3D cpu_get_icount();
+    if (icount_enabled()) {
+        *val =3D icount_get();
     } else {
         *val =3D cpu_get_host_ticks();
     }
@@ -209,8 +209,8 @@ static int read_instret(CPURISCVState *env, int csrno, =
target_ulong *val)
 static int read_instreth(CPURISCVState *env, int csrno, target_ulong *val)
 {
 #if !defined(CONFIG_USER_ONLY)
-    if (use_icount) {
-        *val =3D cpu_get_icount() >> 32;
+    if (icount_enabled()) {
+        *val =3D icount_get() >> 32;
     } else {
         *val =3D cpu_get_host_ticks() >> 32;
     }
diff --git a/tests/ptimer-test-stubs.c b/tests/ptimer-test-stubs.c
index ed393d9082..320dcf99b7 100644
--- a/tests/ptimer-test-stubs.c
+++ b/tests/ptimer-test-stubs.c
@@ -12,6 +12,7 @@
 #include "qemu/main-loop.h"
 #include "sysemu/replay.h"
 #include "migration/vmstate.h"
+#include "sysemu/cpu-timers.h"
=20
 #include "ptimer-test.h"
=20
@@ -126,3 +127,8 @@ void replay_bh_schedule_event(QEMUBH *bh)
 {
     bh->cb(bh->opaque);
 }
+
+int icount_enabled(void)
+{
+    return 0;
+}
diff --git a/tests/test-timed-average.c b/tests/test-timed-average.c
index e2bcf5fe13..82c92500df 100644
--- a/tests/test-timed-average.c
+++ b/tests/test-timed-average.c
@@ -11,7 +11,7 @@
  */
=20
 #include "qemu/osdep.h"
-
+#include "sysemu/cpu-timers.h"
 #include "qemu/timed-average.h"
=20
 /* This is the clock for QEMU_CLOCK_VIRTUAL */
diff --git a/util/main-loop.c b/util/main-loop.c
index eda63fe4e0..f1af697572 100644
--- a/util/main-loop.c
+++ b/util/main-loop.c
@@ -27,7 +27,7 @@
 #include "qemu/cutils.h"
 #include "qemu/timer.h"
 #include "sysemu/qtest.h"
-#include "sysemu/cpus.h"
+#include "sysemu/cpu-timers.h"
 #include "sysemu/replay.h"
 #include "qemu/main-loop.h"
 #include "block/aio.h"
@@ -521,7 +521,7 @@ void main_loop_wait(int nonblocking)
=20
     /* CPU thread can infinitely wait for event after
        missing the warp */
-    qemu_start_warp_timer();
+    icount_start_warp_timer();
     qemu_clock_run_all_timers();
 }
=20
diff --git a/util/qemu-timer.c b/util/qemu-timer.c
index b6575a2cd5..da2883f914 100644
--- a/util/qemu-timer.c
+++ b/util/qemu-timer.c
@@ -26,6 +26,7 @@
 #include "qemu/main-loop.h"
 #include "qemu/timer.h"
 #include "qemu/lockable.h"
+#include "sysemu/cpu-timers.h"
 #include "sysemu/replay.h"
 #include "sysemu/cpus.h"
=20
@@ -134,7 +135,7 @@ static void qemu_clock_init(QEMUClockType type, QEMUTim=
erListNotifyCB *notify_cb
=20
 bool qemu_clock_use_for_deadline(QEMUClockType type)
 {
-    return !(use_icount && (type =3D=3D QEMU_CLOCK_VIRTUAL));
+    return !(icount_enabled() && (type =3D=3D QEMU_CLOCK_VIRTUAL));
 }
=20
 void qemu_clock_notify(QEMUClockType type)
@@ -417,7 +418,7 @@ static void timerlist_rearm(QEMUTimerList *timer_list)
 {
     /* Interrupt execution to force deadline recalculation.  */
     if (timer_list->clock->type =3D=3D QEMU_CLOCK_VIRTUAL) {
-        qemu_start_warp_timer();
+        icount_start_warp_timer();
     }
     timerlist_notify(timer_list);
 }
@@ -647,8 +648,8 @@ int64_t qemu_clock_get_ns(QEMUClockType type)
         return get_clock();
     default:
     case QEMU_CLOCK_VIRTUAL:
-        if (use_icount) {
-            return cpu_get_icount();
+        if (icount_enabled()) {
+            return icount_get();
         } else {
             return cpu_get_clock();
         }
--=20
2.16.4


From nobody Mon Feb  9 04:55:47 2026
Delivered-To: importer@patchew.org
Authentication-Results: mx.zohomail.com;
	spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as
 permitted sender)
  smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org
ARC-Seal: i=1; a=rsa-sha256; t=1590087409; cv=none;
	d=zohomail.com; s=zohoarc;
	b=nF7PQZtGVp4Og/6D4+17O2swMOmauixky/uoUT6LsAv0GQxX5w+ZAPT1APm38wnlzVmCnZr3UG19d8tOTo5ZCq6PZx2znD32WqnrZ/drRpjznIwEbRfHag7wajy7fmDwTw5as81bthTOoliEk4pxEqcVb6/rWEp/ctLCdECURP0=
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com;
 s=zohoarc;
	t=1590087409;
 h=Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:Message-ID:References:Sender:Subject:To;
	bh=l6atjsHJE8QS7cd8+REZO2FvWqHbovmDKJiCW8SE6aw=;
	b=hgFeOb83M9ct6inU/8CJYi3JjdWz1D7skXpj3WdPUPEFJzEXBBGqa5w21DsL8w8gdpPwQ9CsBN3Qqvy4v90HZYlDVHtVDnZneZggGGQXyHYpUZG4+RoaGbfyEy5hpcxFcnVgQYCNIXghmt9Yds0sjP9N0K5cUyGgXIz8qVZOUXc=
ARC-Authentication-Results: i=1; mx.zohomail.com;
	spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as
 permitted sender)
  smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org
Return-Path: <qemu-devel-bounces+importer=patchew.org@nongnu.org>
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by
 mx.zohomail.com
	with SMTPS id 1590087409506544.5809930320231;
 Thu, 21 May 2020 11:56:49 -0700 (PDT)
Received: from localhost ([::1]:47126 helo=lists1p.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.90_1)
	(envelope-from <qemu-devel-bounces+importer=patchew.org@nongnu.org>)
	id 1jbqN5-0004Nw-Rk
	for importer@patchew.org; Thu, 21 May 2020 14:56:47 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]:35316)
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <cfontana@suse.de>) id 1jbqKh-00014Y-Uj
 for qemu-devel@nongnu.org; Thu, 21 May 2020 14:54:20 -0400
Received: from mx2.suse.de ([195.135.220.15]:49938)
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <cfontana@suse.de>) id 1jbqKc-00042A-WE
 for qemu-devel@nongnu.org; Thu, 21 May 2020 14:54:19 -0400
Received: from relay2.suse.de (unknown [195.135.220.254])
 by mx2.suse.de (Postfix) with ESMTP id E4191B01C;
 Thu, 21 May 2020 18:54:15 +0000 (UTC)
X-Virus-Scanned: by amavisd-new at test-mx.suse.de
From: Claudio Fontana <cfontana@suse.de>
To: Paolo Bonzini <pbonzini@redhat.com>,
 =?UTF-8?q?Alex=20Benn=C3=A9e?= <alex.bennee@linaro.org>,
 Peter Maydell <peter.maydell@linaro.org>,
 =?UTF-8?q?Philippe=20Mathieu-Daud=C3=A9?= <philmd@redhat.com>
Subject: [RFC 3/3] cpus: implement cpus interfaces for per-accel threads
Date: Thu, 21 May 2020 20:54:07 +0200
Message-Id: <20200521185407.25311-4-cfontana@suse.de>
X-Mailer: git-send-email 2.16.4
In-Reply-To: <20200521185407.25311-1-cfontana@suse.de>
References: <20200521185407.25311-1-cfontana@suse.de>
Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17
 as permitted sender) client-ip=209.51.188.17;
 envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org;
 helo=lists.gnu.org;
Received-SPF: pass client-ip=195.135.220.15; envelope-from=cfontana@suse.de;
 helo=mx2.suse.de
X-detected-operating-system: by eggs.gnu.org: First seen = 2020/05/21 02:02:44
X-ACL-Warn: Detected OS   = Linux 2.2.x-3.x (no timestamps) [generic]
X-Spam_score_int: -41
X-Spam_score: -4.2
X-Spam_bar: ----
X-Spam_report: (-4.2 / 5.0 requ) BAYES_00=-1.9, RCVD_IN_DNSWL_MED=-2.3,
 RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_PASS=-0.001,
 URIBL_BLOCKED=0.001 autolearn=_AUTOLEARN
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
 <mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <https://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
 <mailto:qemu-devel-request@nongnu.org?subject=subscribe>
Cc: Laurent Vivier <lvivier@redhat.com>, Thomas Huth <thuth@redhat.com>,
 Eduardo Habkost <ehabkost@redhat.com>, Marcelo Tosatti <mtosatti@redhat.com>,
 "open list:All patches CC here" <qemu-devel@nongnu.org>,
 Roman Bolshakov <r.bolshakov@yadro.com>,
 Wenchao Wang <wenchao.wang@intel.com>,
 Colin Xu <colin.xu@intel.com>, Claudio Fontana <cfontana@suse.de>,
 "open list:X86 HAXM CPUs" <haxm-team@intel.com>,
 Sunil Muthuswamy <sunilmut@microsoft.com>,
 Richard Henderson <rth@twiddle.net>
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel" <qemu-devel-bounces+importer=patchew.org@nongnu.org>
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Signed-off-by: Claudio Fontana <cfontana@suse.de>
---
 MAINTAINERS                          |   1 +
 accel/kvm/Makefile.objs              |   2 +
 accel/kvm/kvm-all.c                  |  15 +-
 accel/kvm/kvm-cpus-interface.c       |  94 ++++
 accel/kvm/kvm-cpus-interface.h       |   8 +
 accel/qtest.c                        |  82 ++++
 accel/stubs/kvm-stub.c               |   3 +-
 accel/tcg/Makefile.objs              |   1 +
 accel/tcg/tcg-all.c                  |  12 +-
 accel/tcg/tcg-cpus-interface.c       | 523 ++++++++++++++++++++
 accel/tcg/tcg-cpus-interface.h       |   8 +
 cpus.c                               | 911 +++----------------------------=
----
 hw/core/cpu.c                        |   1 +
 include/sysemu/cpus.h                |  44 ++
 include/sysemu/hvf.h                 |   1 -
 include/sysemu/hw_accel.h            |  57 +--
 include/sysemu/kvm.h                 |   2 +-
 stubs/Makefile.objs                  |   1 +
 stubs/cpu-synchronize-state.c        |  15 +
 target/i386/Makefile.objs            |   7 +-
 target/i386/hax-all.c                |   6 +-
 target/i386/hax-cpus-interface.c     |  85 ++++
 target/i386/hax-cpus-interface.h     |   8 +
 target/i386/hax-i386.h               |   2 +
 target/i386/hax-posix.c              |  12 +
 target/i386/hax-windows.c            |  20 +
 target/i386/hvf/Makefile.objs        |   2 +-
 target/i386/hvf/hvf-cpus-interface.c |  92 ++++
 target/i386/hvf/hvf-cpus-interface.h |   8 +
 target/i386/hvf/hvf.c                |   5 +-
 target/i386/whpx-all.c               |   3 +
 target/i386/whpx-cpus-interface.c    |  96 ++++
 target/i386/whpx-cpus-interface.h    |   8 +
 33 files changed, 1232 insertions(+), 903 deletions(-)
 create mode 100644 accel/kvm/kvm-cpus-interface.c
 create mode 100644 accel/kvm/kvm-cpus-interface.h
 create mode 100644 accel/tcg/tcg-cpus-interface.c
 create mode 100644 accel/tcg/tcg-cpus-interface.h
 create mode 100644 stubs/cpu-synchronize-state.c
 create mode 100644 target/i386/hax-cpus-interface.c
 create mode 100644 target/i386/hax-cpus-interface.h
 create mode 100644 target/i386/hvf/hvf-cpus-interface.c
 create mode 100644 target/i386/hvf/hvf-cpus-interface.h
 create mode 100644 target/i386/whpx-cpus-interface.c
 create mode 100644 target/i386/whpx-cpus-interface.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 1b3b17fda8..a256d5574c 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -426,6 +426,7 @@ WHPX CPUs
 M: Sunil Muthuswamy <sunilmut@microsoft.com>
 S: Supported
 F: target/i386/whpx-all.c
+F: target/i386/whpx-cpus-interface.c
 F: target/i386/whp-dispatch.h
 F: accel/stubs/whpx-stub.c
 F: include/sysemu/whpx.h
diff --git a/accel/kvm/Makefile.objs b/accel/kvm/Makefile.objs
index fdfa481578..4babbf7796 100644
--- a/accel/kvm/Makefile.objs
+++ b/accel/kvm/Makefile.objs
@@ -1,2 +1,4 @@
 obj-y +=3D kvm-all.o
+obj-y +=3D kvm-cpus-interface.o
+
 obj-$(call lnot,$(CONFIG_SEV)) +=3D sev-stub.o
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index d06cc04079..c9cbbb1184 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -45,6 +45,10 @@
 #include "qapi/qapi-types-common.h"
 #include "qapi/qapi-visit-common.h"
 #include "sysemu/reset.h"
+#include "qemu/guest-random.h"
+
+#include "sysemu/hw_accel.h"
+#include "kvm-cpus-interface.h"
=20
 #include "hw/boards.h"
=20
@@ -329,7 +333,7 @@ err:
     return ret;
 }
=20
-int kvm_destroy_vcpu(CPUState *cpu)
+static int do_kvm_destroy_vcpu(CPUState *cpu)
 {
     KVMState *s =3D kvm_state;
     long mmap_size;
@@ -363,6 +367,14 @@ err:
     return ret;
 }
=20
+void kvm_destroy_vcpu(CPUState *cpu)
+{
+    if (do_kvm_destroy_vcpu(cpu) < 0) {
+        error_report("kvm_destroy_vcpu failed");
+        exit(EXIT_FAILURE);
+    }
+}
+
 static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id)
 {
     struct KVMParkedVcpu *cpu;
@@ -2146,6 +2158,7 @@ static int kvm_init(MachineState *ms)
         qemu_balloon_inhibit(true);
     }
=20
+    cpus_register_accel_interface(&kvm_cpus_interface);
     return 0;
=20
 err:
diff --git a/accel/kvm/kvm-cpus-interface.c b/accel/kvm/kvm-cpus-interface.c
new file mode 100644
index 0000000000..fd3d117364
--- /dev/null
+++ b/accel/kvm/kvm-cpus-interface.c
@@ -0,0 +1,94 @@
+/*
+ * QEMU KVM support
+ *
+ * Copyright IBM, Corp. 2008
+ *           Red Hat, Inc. 2008
+ *
+ * Authors:
+ *  Anthony Liguori   <aliguori@us.ibm.com>
+ *  Glauber Costa     <gcosta@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or late=
r.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/error-report.h"
+#include "qemu/main-loop.h"
+#include "sysemu/kvm_int.h"
+#include "sysemu/runstate.h"
+#include "sysemu/cpus.h"
+#include "qemu/guest-random.h"
+
+#include "kvm-cpus-interface.h"
+
+static void kvm_kick_vcpu_thread(CPUState *cpu)
+{
+    cpus_kick_thread(cpu);
+}
+
+static void *kvm_vcpu_thread_fn(void *arg)
+{
+    CPUState *cpu =3D arg;
+    int r;
+
+    rcu_register_thread();
+
+    qemu_mutex_lock_iothread();
+    qemu_thread_get_self(cpu->thread);
+    cpu->thread_id =3D qemu_get_thread_id();
+    cpu->can_do_io =3D 1;
+    current_cpu =3D cpu;
+
+    r =3D kvm_init_vcpu(cpu);
+    if (r < 0) {
+        error_report("kvm_init_vcpu failed: %s", strerror(-r));
+        exit(1);
+    }
+
+    kvm_init_cpu_signals(cpu);
+
+    /* signal CPU creation */
+    cpu_thread_signal_created(cpu);
+    qemu_guest_random_seed_thread_part2(cpu->random_seed);
+
+    do {
+        if (cpu_can_run(cpu)) {
+            r =3D kvm_cpu_exec(cpu);
+            if (r =3D=3D EXCP_DEBUG) {
+                cpu_handle_guest_debug(cpu);
+            }
+        }
+        qemu_wait_io_event(cpu);
+    } while (!cpu->unplug || cpu_can_run(cpu));
+
+    kvm_destroy_vcpu(cpu);
+    cpu_thread_signal_destroyed(cpu);
+    qemu_mutex_unlock_iothread();
+    rcu_unregister_thread();
+    return NULL;
+}
+
+static void kvm_start_vcpu_thread(CPUState *cpu)
+{
+    char thread_name[VCPU_THREAD_NAME_SIZE];
+
+    cpu->thread =3D g_malloc0(sizeof(QemuThread));
+    cpu->halt_cond =3D g_malloc0(sizeof(QemuCond));
+    qemu_cond_init(cpu->halt_cond);
+    snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/KVM",
+             cpu->cpu_index);
+    qemu_thread_create(cpu->thread, thread_name, kvm_vcpu_thread_fn,
+                       cpu, QEMU_THREAD_JOINABLE);
+}
+
+CpusAccelInterface kvm_cpus_interface =3D {
+    .create_vcpu_thread =3D kvm_start_vcpu_thread,
+    .kick_vcpu_thread =3D kvm_kick_vcpu_thread,
+
+    .cpu_synchronize_post_reset =3D kvm_cpu_synchronize_post_reset,
+    .cpu_synchronize_post_init =3D kvm_cpu_synchronize_post_init,
+    .cpu_synchronize_state =3D kvm_cpu_synchronize_state,
+    .cpu_synchronize_pre_loadvm =3D kvm_cpu_synchronize_pre_loadvm,
+};
diff --git a/accel/kvm/kvm-cpus-interface.h b/accel/kvm/kvm-cpus-interface.h
new file mode 100644
index 0000000000..5531a4a4ad
--- /dev/null
+++ b/accel/kvm/kvm-cpus-interface.h
@@ -0,0 +1,8 @@
+#ifndef KVM_CPUS_INTERFACE_H
+#define KVM_CPUS_INTERFACE_H
+
+#include "sysemu/cpus.h"
+
+extern CpusAccelInterface kvm_cpus_interface;
+
+#endif /* KVM_CPUS_INTERFACE */
diff --git a/accel/qtest.c b/accel/qtest.c
index ef9ee0941a..1677c8724d 100644
--- a/accel/qtest.c
+++ b/accel/qtest.c
@@ -12,6 +12,7 @@
  */
=20
 #include "qemu/osdep.h"
+#include "qemu/rcu.h"
 #include "qapi/error.h"
 #include "qemu/module.h"
 #include "qemu/option.h"
@@ -20,6 +21,86 @@
 #include "sysemu/qtest.h"
 #include "sysemu/cpus.h"
 #include "sysemu/cpu-timers.h"
+#include "qemu/guest-random.h"
+#include "qemu/main-loop.h"
+#include "hw/core/cpu.h"
+
+static void qtest_cpu_synchronize_noop(CPUState *cpu)
+{
+}
+
+static void qtest_kick_vcpu_thread(CPUState *cpu)
+{
+    cpus_kick_thread(cpu);
+}
+
+static void *qtest_cpu_thread_fn(void *arg)
+{
+#ifdef _WIN32
+    error_report("qtest is not supported under Windows");
+    exit(1);
+#else
+    CPUState *cpu =3D arg;
+    sigset_t waitset;
+    int r;
+
+    rcu_register_thread();
+
+    qemu_mutex_lock_iothread();
+    qemu_thread_get_self(cpu->thread);
+    cpu->thread_id =3D qemu_get_thread_id();
+    cpu->can_do_io =3D 1;
+    current_cpu =3D cpu;
+
+    sigemptyset(&waitset);
+    sigaddset(&waitset, SIG_IPI);
+
+    /* signal CPU creation */
+    cpu_thread_signal_created(cpu);
+    qemu_guest_random_seed_thread_part2(cpu->random_seed);
+
+    do {
+        qemu_mutex_unlock_iothread();
+        do {
+            int sig;
+            r =3D sigwait(&waitset, &sig);
+        } while (r =3D=3D -1 && (errno =3D=3D EAGAIN || errno =3D=3D EINTR=
));
+        if (r =3D=3D -1) {
+            perror("sigwait");
+            exit(1);
+        }
+        qemu_mutex_lock_iothread();
+        qemu_wait_io_event(cpu);
+    } while (!cpu->unplug);
+
+    qemu_mutex_unlock_iothread();
+    rcu_unregister_thread();
+    return NULL;
+#endif
+}
+
+static void qtest_start_vcpu_thread(CPUState *cpu)
+{
+    char thread_name[VCPU_THREAD_NAME_SIZE];
+
+    cpu->thread =3D g_malloc0(sizeof(QemuThread));
+    cpu->halt_cond =3D g_malloc0(sizeof(QemuCond));
+    qemu_cond_init(cpu->halt_cond);
+    snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/DUMMY",
+             cpu->cpu_index);
+    qemu_thread_create(cpu->thread, thread_name, qtest_cpu_thread_fn, cpu,
+                       QEMU_THREAD_JOINABLE);
+}
+
+CpusAccelInterface qtest_cpus_interface =3D {
+    .create_vcpu_thread =3D qtest_start_vcpu_thread,
+    .kick_vcpu_thread =3D qtest_kick_vcpu_thread,
+
+    .cpu_synchronize_post_reset =3D qtest_cpu_synchronize_noop,
+    .cpu_synchronize_post_init =3D qtest_cpu_synchronize_noop,
+    .cpu_synchronize_state =3D qtest_cpu_synchronize_noop,
+    .cpu_synchronize_pre_loadvm =3D qtest_cpu_synchronize_noop,
+};
=20
 static int qtest_init_accel(MachineState *ms)
 {
@@ -28,6 +109,7 @@ static int qtest_init_accel(MachineState *ms)
     qemu_opt_set(opts, "shift", "0", &error_abort);
     icount_configure(opts, &error_abort);
     qemu_opts_del(opts);
+    cpus_register_accel_interface(&qtest_cpus_interface);
     return 0;
 }
=20
diff --git a/accel/stubs/kvm-stub.c b/accel/stubs/kvm-stub.c
index 82f118d2df..69f8a842da 100644
--- a/accel/stubs/kvm-stub.c
+++ b/accel/stubs/kvm-stub.c
@@ -32,9 +32,8 @@ bool kvm_readonly_mem_allowed;
 bool kvm_ioeventfd_any_length_allowed;
 bool kvm_msi_use_devid;
=20
-int kvm_destroy_vcpu(CPUState *cpu)
+void kvm_destroy_vcpu(CPUState *cpu)
 {
-    return -ENOSYS;
 }
=20
 int kvm_init_vcpu(CPUState *cpu)
diff --git a/accel/tcg/Makefile.objs b/accel/tcg/Makefile.objs
index a92f2c454b..ddc57acae2 100644
--- a/accel/tcg/Makefile.objs
+++ b/accel/tcg/Makefile.objs
@@ -1,5 +1,6 @@
 obj-$(CONFIG_SOFTMMU) +=3D tcg-all.o
 obj-$(CONFIG_SOFTMMU) +=3D cputlb.o
+obj-$(CONFIG_SOFTMMU) +=3D tcg-cpus-interface.o
 obj-y +=3D tcg-runtime.o tcg-runtime-gvec.o
 obj-y +=3D cpu-exec.o cpu-exec-common.o translate-all.o
 obj-y +=3D translator.o
diff --git a/accel/tcg/tcg-all.c b/accel/tcg/tcg-all.c
index e27385d051..9e332585d3 100644
--- a/accel/tcg/tcg-all.c
+++ b/accel/tcg/tcg-all.c
@@ -24,19 +24,17 @@
  */
=20
 #include "qemu/osdep.h"
-#include "sysemu/accel.h"
+#include "qemu-common.h"
 #include "sysemu/tcg.h"
-#include "qom/object.h"
-#include "cpu.h"
-#include "sysemu/cpus.h"
 #include "sysemu/cpu-timers.h"
-#include "qemu/main-loop.h"
 #include "tcg/tcg.h"
 #include "qapi/error.h"
 #include "qemu/error-report.h"
 #include "hw/boards.h"
 #include "qapi/qapi-builtin-visit.h"
=20
+#include "tcg-cpus-interface.h"
+
 typedef struct TCGState {
     AccelState parent_obj;
=20
@@ -123,6 +121,8 @@ static void tcg_accel_instance_init(Object *obj)
     s->mttcg_enabled =3D default_mttcg_enabled();
 }
=20
+bool mttcg_enabled;
+
 static int tcg_init(MachineState *ms)
 {
     TCGState *s =3D TCG_STATE(current_accel());
@@ -130,6 +130,8 @@ static int tcg_init(MachineState *ms)
     tcg_exec_init(s->tb_size * 1024 * 1024);
     cpu_interrupt_handler =3D tcg_handle_interrupt;
     mttcg_enabled =3D s->mttcg_enabled;
+    cpus_register_accel_interface(&tcg_cpus_interface);
+
     return 0;
 }
=20
diff --git a/accel/tcg/tcg-cpus-interface.c b/accel/tcg/tcg-cpus-interface.c
new file mode 100644
index 0000000000..28a88beb84
--- /dev/null
+++ b/accel/tcg/tcg-cpus-interface.c
@@ -0,0 +1,523 @@
+/*
+ * QEMU System Emulator, accelerator interfaces
+ *
+ * Copyright (c) 2003-2008 Fabrice Bellard
+ * Copyright (c) 2014 Red Hat Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a=
 copy
+ * of this software and associated documentation files (the "Software"), t=
o deal
+ * in the Software without restriction, including without limitation the r=
ights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or se=
ll
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included=
 in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS=
 OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OT=
HER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING=
 FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS =
IN
+ * THE SOFTWARE.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu-common.h"
+#include "sysemu/tcg.h"
+#include "sysemu/replay.h"
+#include "qemu/main-loop.h"
+#include "qemu/guest-random.h"
+#include "exec/exec-all.h"
+
+#include "tcg-cpus-interface.h"
+
+/* Kick all RR vCPUs */
+static void qemu_cpu_kick_rr_cpus(void)
+{
+    CPUState *cpu;
+
+    CPU_FOREACH(cpu) {
+        cpu_exit(cpu);
+    };
+}
+
+static void tcg_kick_vcpu_thread(CPUState *cpu)
+{
+    if (qemu_tcg_mttcg_enabled()) {
+        cpu_exit(cpu);
+    } else {
+        qemu_cpu_kick_rr_cpus();
+    }
+}
+
+/*
+ * TCG vCPU kick timer
+ *
+ * The kick timer is responsible for moving single threaded vCPU
+ * emulation on to the next vCPU. If more than one vCPU is running a
+ * timer event with force a cpu->exit so the next vCPU can get
+ * scheduled.
+ *
+ * The timer is removed if all vCPUs are idle and restarted again once
+ * idleness is complete.
+ */
+
+static QEMUTimer *tcg_kick_vcpu_timer;
+static CPUState *tcg_current_rr_cpu;
+
+#define TCG_KICK_PERIOD (NANOSECONDS_PER_SECOND / 10)
+
+static inline int64_t qemu_tcg_next_kick(void)
+{
+    return qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + TCG_KICK_PERIOD;
+}
+
+/* Kick the currently round-robin scheduled vCPU to next */
+static void qemu_cpu_kick_rr_next_cpu(void)
+{
+    CPUState *cpu;
+    do {
+        cpu =3D atomic_mb_read(&tcg_current_rr_cpu);
+        if (cpu) {
+            cpu_exit(cpu);
+        }
+    } while (cpu !=3D atomic_mb_read(&tcg_current_rr_cpu));
+}
+
+static void kick_tcg_thread(void *opaque)
+{
+    timer_mod(tcg_kick_vcpu_timer, qemu_tcg_next_kick());
+    qemu_cpu_kick_rr_next_cpu();
+}
+
+static void start_tcg_kick_timer(void)
+{
+    assert(!mttcg_enabled);
+    if (!tcg_kick_vcpu_timer && CPU_NEXT(first_cpu)) {
+        tcg_kick_vcpu_timer =3D timer_new_ns(QEMU_CLOCK_VIRTUAL,
+                                           kick_tcg_thread, NULL);
+    }
+    if (tcg_kick_vcpu_timer && !timer_pending(tcg_kick_vcpu_timer)) {
+        timer_mod(tcg_kick_vcpu_timer, qemu_tcg_next_kick());
+    }
+}
+
+static void stop_tcg_kick_timer(void)
+{
+    assert(!mttcg_enabled);
+    if (tcg_kick_vcpu_timer && timer_pending(tcg_kick_vcpu_timer)) {
+        timer_del(tcg_kick_vcpu_timer);
+    }
+}
+
+static void qemu_tcg_destroy_vcpu(CPUState *cpu)
+{
+}
+
+static void qemu_tcg_rr_wait_io_event(void)
+{
+    CPUState *cpu;
+
+    while (all_cpu_threads_idle()) {
+        stop_tcg_kick_timer();
+        qemu_cond_wait_iothread(first_cpu->halt_cond);
+    }
+
+    start_tcg_kick_timer();
+
+    CPU_FOREACH(cpu) {
+        qemu_wait_io_event_common(cpu);
+    }
+}
+
+static int64_t tcg_get_icount_limit(void)
+{
+    int64_t deadline;
+
+    if (replay_mode !=3D REPLAY_MODE_PLAY) {
+        /*
+         * Include all the timers, because they may need an attention.
+         * Too long CPU execution may create unnecessary delay in UI.
+         */
+        deadline =3D qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL,
+                                              QEMU_TIMER_ATTR_ALL);
+        /* Check realtime timers, because they help with input processing =
*/
+        deadline =3D qemu_soonest_timeout(deadline,
+                qemu_clock_deadline_ns_all(QEMU_CLOCK_REALTIME,
+                                           QEMU_TIMER_ATTR_ALL));
+
+        /*
+         * Maintain prior (possibly buggy) behaviour where if no deadline
+         * was set (as there is no QEMU_CLOCK_VIRTUAL timer) or it is more=
 than
+         * INT32_MAX nanoseconds ahead, we still use INT32_MAX
+         * nanoseconds.
+         */
+        if ((deadline < 0) || (deadline > INT32_MAX)) {
+            deadline =3D INT32_MAX;
+        }
+
+        return icount_round(deadline);
+    } else {
+        return replay_get_instructions();
+    }
+}
+
+static void handle_icount_deadline(void)
+{
+    assert(qemu_in_vcpu_thread());
+    if (icount_enabled()) {
+        int64_t deadline =3D qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL,
+                                                      QEMU_TIMER_ATTR_ALL);
+
+        if (deadline =3D=3D 0) {
+            /* Wake up other AioContexts.  */
+            qemu_clock_notify(QEMU_CLOCK_VIRTUAL);
+            qemu_clock_run_timers(QEMU_CLOCK_VIRTUAL);
+        }
+    }
+}
+
+static void prepare_icount_for_run(CPUState *cpu)
+{
+    if (icount_enabled()) {
+        int insns_left;
+
+        /*
+         * These should always be cleared by process_icount_data after
+         * each vCPU execution. However u16.high can be raised
+         * asynchronously by cpu_exit/cpu_interrupt/tcg_handle_interrupt
+         */
+        g_assert(cpu_neg(cpu)->icount_decr.u16.low =3D=3D 0);
+        g_assert(cpu->icount_extra =3D=3D 0);
+
+        cpu->icount_budget =3D tcg_get_icount_limit();
+        insns_left =3D MIN(0xffff, cpu->icount_budget);
+        cpu_neg(cpu)->icount_decr.u16.low =3D insns_left;
+        cpu->icount_extra =3D cpu->icount_budget - insns_left;
+
+        replay_mutex_lock();
+    }
+}
+
+static void process_icount_data(CPUState *cpu)
+{
+    if (icount_enabled()) {
+        /* Account for executed instructions */
+        icount_update(cpu);
+
+        /* Reset the counters */
+        cpu_neg(cpu)->icount_decr.u16.low =3D 0;
+        cpu->icount_extra =3D 0;
+        cpu->icount_budget =3D 0;
+
+        replay_account_executed_instructions();
+
+        replay_mutex_unlock();
+    }
+}
+
+static int tcg_cpu_exec(CPUState *cpu)
+{
+    int ret;
+#ifdef CONFIG_PROFILER
+    int64_t ti;
+#endif
+
+    assert(tcg_enabled());
+#ifdef CONFIG_PROFILER
+    ti =3D profile_getclock();
+#endif
+    cpu_exec_start(cpu);
+    ret =3D cpu_exec(cpu);
+    cpu_exec_end(cpu);
+#ifdef CONFIG_PROFILER
+    atomic_set(&tcg_ctx->prof.cpu_exec_time,
+               tcg_ctx->prof.cpu_exec_time + profile_getclock() - ti);
+#endif
+    return ret;
+}
+
+/*
+ * Destroy any remaining vCPUs which have been unplugged and have
+ * finished running
+ */
+static void deal_with_unplugged_cpus(void)
+{
+    CPUState *cpu;
+
+    CPU_FOREACH(cpu) {
+        if (cpu->unplug && !cpu_can_run(cpu)) {
+            qemu_tcg_destroy_vcpu(cpu);
+            cpu_thread_signal_destroyed(cpu);
+            break;
+        }
+    }
+}
+
+/*
+ * Single-threaded TCG
+ *
+ * In the single-threaded case each vCPU is simulated in turn. If
+ * there is more than a single vCPU we create a simple timer to kick
+ * the vCPU and ensure we don't get stuck in a tight loop in one vCPU.
+ * This is done explicitly rather than relying on side-effects
+ * elsewhere.
+ */
+
+static void *tcg_rr_cpu_thread_fn(void *arg)
+{
+    CPUState *cpu =3D arg;
+
+    assert(tcg_enabled());
+    rcu_register_thread();
+    tcg_register_thread();
+
+    qemu_mutex_lock_iothread();
+    qemu_thread_get_self(cpu->thread);
+
+    cpu->thread_id =3D qemu_get_thread_id();
+    cpu->can_do_io =3D 1;
+    cpu_thread_signal_created(cpu);
+    qemu_guest_random_seed_thread_part2(cpu->random_seed);
+
+    /* wait for initial kick-off after machine start */
+    while (first_cpu->stopped) {
+        qemu_cond_wait_iothread(first_cpu->halt_cond);
+
+        /* process any pending work */
+        CPU_FOREACH(cpu) {
+            current_cpu =3D cpu;
+            qemu_wait_io_event_common(cpu);
+        }
+    }
+
+    start_tcg_kick_timer();
+
+    cpu =3D first_cpu;
+
+    /* process any pending work */
+    cpu->exit_request =3D 1;
+
+    while (1) {
+        qemu_mutex_unlock_iothread();
+        replay_mutex_lock();
+        qemu_mutex_lock_iothread();
+        /* Account partial waits to QEMU_CLOCK_VIRTUAL.  */
+        icount_account_warp_timer();
+
+        /*
+         * Run the timers here.  This is much more efficient than
+         * waking up the I/O thread and waiting for completion.
+         */
+        handle_icount_deadline();
+
+        replay_mutex_unlock();
+
+        if (!cpu) {
+            cpu =3D first_cpu;
+        }
+
+        while (cpu && !cpu->queued_work_first && !cpu->exit_request) {
+
+            atomic_mb_set(&tcg_current_rr_cpu, cpu);
+            current_cpu =3D cpu;
+
+            qemu_clock_enable(QEMU_CLOCK_VIRTUAL,
+                              (cpu->singlestep_enabled & SSTEP_NOTIMER) =
=3D=3D 0);
+
+            if (cpu_can_run(cpu)) {
+                int r;
+
+                qemu_mutex_unlock_iothread();
+                prepare_icount_for_run(cpu);
+
+                r =3D tcg_cpu_exec(cpu);
+
+                process_icount_data(cpu);
+                qemu_mutex_lock_iothread();
+
+                if (r =3D=3D EXCP_DEBUG) {
+                    cpu_handle_guest_debug(cpu);
+                    break;
+                } else if (r =3D=3D EXCP_ATOMIC) {
+                    qemu_mutex_unlock_iothread();
+                    cpu_exec_step_atomic(cpu);
+                    qemu_mutex_lock_iothread();
+                    break;
+                }
+            } else if (cpu->stop) {
+                if (cpu->unplug) {
+                    cpu =3D CPU_NEXT(cpu);
+                }
+                break;
+            }
+
+            cpu =3D CPU_NEXT(cpu);
+        } /* while (cpu && !cpu->exit_request).. */
+
+        /* Does not need atomic_mb_set because a spurious wakeup is okay. =
 */
+        atomic_set(&tcg_current_rr_cpu, NULL);
+
+        if (cpu && cpu->exit_request) {
+            atomic_mb_set(&cpu->exit_request, 0);
+        }
+
+        if (icount_enabled() && all_cpu_threads_idle()) {
+            /*
+             * When all cpus are sleeping (e.g in WFI), to avoid a deadlock
+             * in the main_loop, wake it up in order to start the warp tim=
er.
+             */
+            qemu_notify_event();
+        }
+
+        qemu_tcg_rr_wait_io_event();
+        deal_with_unplugged_cpus();
+    }
+
+    rcu_unregister_thread();
+    return NULL;
+}
+
+/*
+ * Multi-threaded TCG
+ *
+ * In the multi-threaded case each vCPU has its own thread. The TLS
+ * variable current_cpu can be used deep in the code to find the
+ * current CPUState for a given thread.
+ */
+
+static void *tcg_cpu_thread_fn(void *arg)
+{
+    CPUState *cpu =3D arg;
+
+    assert(tcg_enabled());
+    g_assert(!icount_enabled());
+
+    rcu_register_thread();
+    tcg_register_thread();
+
+    qemu_mutex_lock_iothread();
+    qemu_thread_get_self(cpu->thread);
+
+    cpu->thread_id =3D qemu_get_thread_id();
+    cpu->can_do_io =3D 1;
+    current_cpu =3D cpu;
+    cpu_thread_signal_created(cpu);
+    qemu_guest_random_seed_thread_part2(cpu->random_seed);
+
+    /* process any pending work */
+    cpu->exit_request =3D 1;
+
+    do {
+        if (cpu_can_run(cpu)) {
+            int r;
+            qemu_mutex_unlock_iothread();
+            r =3D tcg_cpu_exec(cpu);
+            qemu_mutex_lock_iothread();
+            switch (r) {
+            case EXCP_DEBUG:
+                cpu_handle_guest_debug(cpu);
+                break;
+            case EXCP_HALTED:
+                /*
+                 * during start-up the vCPU is reset and the thread is
+                 * kicked several times. If we don't ensure we go back
+                 * to sleep in the halted state we won't cleanly
+                 * start-up when the vCPU is enabled.
+                 *
+                 * cpu->halted should ensure we sleep in wait_io_event
+                 */
+                g_assert(cpu->halted);
+                break;
+            case EXCP_ATOMIC:
+                qemu_mutex_unlock_iothread();
+                cpu_exec_step_atomic(cpu);
+                qemu_mutex_lock_iothread();
+            default:
+                /* Ignore everything else? */
+                break;
+            }
+        }
+
+        atomic_mb_set(&cpu->exit_request, 0);
+        qemu_wait_io_event(cpu);
+    } while (!cpu->unplug || cpu_can_run(cpu));
+
+    qemu_tcg_destroy_vcpu(cpu);
+    cpu_thread_signal_destroyed(cpu);
+    qemu_mutex_unlock_iothread();
+    rcu_unregister_thread();
+    return NULL;
+}
+
+static void tcg_start_vcpu_thread(CPUState *cpu)
+{
+    char thread_name[VCPU_THREAD_NAME_SIZE];
+    static QemuCond *single_tcg_halt_cond;
+    static QemuThread *single_tcg_cpu_thread;
+    static int tcg_region_inited;
+
+    assert(tcg_enabled());
+    /*
+     * Initialize TCG regions--once. Now is a good time, because:
+     * (1) TCG's init context, prologue and target globals have been set u=
p.
+     * (2) qemu_tcg_mttcg_enabled() works now (TCG init code runs before t=
he
+     *     -accel flag is processed, so the check doesn't work then).
+     */
+    if (!tcg_region_inited) {
+        tcg_region_inited =3D 1;
+        tcg_region_init();
+    }
+
+    if (qemu_tcg_mttcg_enabled() || !single_tcg_cpu_thread) {
+        cpu->thread =3D g_malloc0(sizeof(QemuThread));
+        cpu->halt_cond =3D g_malloc0(sizeof(QemuCond));
+        qemu_cond_init(cpu->halt_cond);
+
+        if (qemu_tcg_mttcg_enabled()) {
+            /* create a thread per vCPU with TCG (MTTCG) */
+            parallel_cpus =3D true;
+            snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/TCG",
+                 cpu->cpu_index);
+
+            qemu_thread_create(cpu->thread, thread_name, tcg_cpu_thread_fn,
+                               cpu, QEMU_THREAD_JOINABLE);
+
+        } else {
+            /* share a single thread for all cpus with TCG */
+            snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "ALL CPUs/TCG");
+            qemu_thread_create(cpu->thread, thread_name,
+                               tcg_rr_cpu_thread_fn,
+                               cpu, QEMU_THREAD_JOINABLE);
+
+            single_tcg_halt_cond =3D cpu->halt_cond;
+            single_tcg_cpu_thread =3D cpu->thread;
+        }
+#ifdef _WIN32
+        cpu->hThread =3D qemu_thread_get_handle(cpu->thread);
+#endif
+    } else {
+        /* For non-MTTCG cases we share the thread */
+        cpu->thread =3D single_tcg_cpu_thread;
+        cpu->halt_cond =3D single_tcg_halt_cond;
+        cpu->thread_id =3D first_cpu->thread_id;
+        cpu->can_do_io =3D 1;
+        cpu->created =3D true;
+    }
+}
+
+static void tcg_cpu_synchronize_noop(CPUState *cpu)
+{
+}
+
+CpusAccelInterface tcg_cpus_interface =3D {
+    .create_vcpu_thread =3D tcg_start_vcpu_thread,
+    .kick_vcpu_thread =3D tcg_kick_vcpu_thread,
+
+    .cpu_synchronize_post_reset =3D tcg_cpu_synchronize_noop,
+    .cpu_synchronize_post_init =3D tcg_cpu_synchronize_noop,
+    .cpu_synchronize_state =3D tcg_cpu_synchronize_noop,
+    .cpu_synchronize_pre_loadvm =3D tcg_cpu_synchronize_noop,
+};
diff --git a/accel/tcg/tcg-cpus-interface.h b/accel/tcg/tcg-cpus-interface.h
new file mode 100644
index 0000000000..c6e96b2af4
--- /dev/null
+++ b/accel/tcg/tcg-cpus-interface.h
@@ -0,0 +1,8 @@
+#ifndef TCG_CPUS_INTERFACE_H
+#define TCG_CPUS_INTERFACE_H
+
+#include "sysemu/cpus.h"
+
+extern CpusAccelInterface tcg_cpus_interface;
+
+#endif /* TCG_CPUS_INTERFACE */
diff --git a/cpus.c b/cpus.c
index 7e9f545be8..3f5d34981e 100644
--- a/cpus.c
+++ b/cpus.c
@@ -24,27 +24,19 @@
=20
 #include "qemu/osdep.h"
 #include "qemu-common.h"
-#include "qemu/config-file.h"
-#include "qemu/cutils.h"
-#include "migration/vmstate.h"
 #include "monitor/monitor.h"
 #include "qapi/error.h"
 #include "qapi/qapi-commands-misc.h"
 #include "qapi/qapi-events-run-state.h"
 #include "qapi/qmp/qerror.h"
-#include "qemu/error-report.h"
-#include "qemu/qemu-print.h"
 #include "sysemu/tcg.h"
-#include "sysemu/block-backend.h"
 #include "exec/gdbstub.h"
-#include "sysemu/dma.h"
 #include "sysemu/hw_accel.h"
 #include "sysemu/kvm.h"
 #include "sysemu/hax.h"
 #include "sysemu/hvf.h"
 #include "sysemu/whpx.h"
 #include "exec/exec-all.h"
-
 #include "qemu/thread.h"
 #include "qemu/plugin.h"
 #include "sysemu/cpus.h"
@@ -87,7 +79,7 @@ bool cpu_is_stopped(CPUState *cpu)
     return cpu->stopped || !runstate_is_running();
 }
=20
-static bool cpu_thread_is_idle(CPUState *cpu)
+bool cpu_thread_is_idle(CPUState *cpu)
 {
     if (cpu->stop || cpu->queued_work_first) {
         return false;
@@ -114,78 +106,6 @@ bool all_cpu_threads_idle(void)
     return true;
 }
=20
-bool mttcg_enabled;
-
-/***********************************************************/
-/* TCG vCPU kick timer
- *
- * The kick timer is responsible for moving single threaded vCPU
- * emulation on to the next vCPU. If more than one vCPU is running a
- * timer event with force a cpu->exit so the next vCPU can get
- * scheduled.
- *
- * The timer is removed if all vCPUs are idle and restarted again once
- * idleness is complete.
- */
-
-static QEMUTimer *tcg_kick_vcpu_timer;
-static CPUState *tcg_current_rr_cpu;
-
-#define TCG_KICK_PERIOD (NANOSECONDS_PER_SECOND / 10)
-
-static inline int64_t qemu_tcg_next_kick(void)
-{
-    return qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + TCG_KICK_PERIOD;
-}
-
-/* Kick the currently round-robin scheduled vCPU to next */
-static void qemu_cpu_kick_rr_next_cpu(void)
-{
-    CPUState *cpu;
-    do {
-        cpu =3D atomic_mb_read(&tcg_current_rr_cpu);
-        if (cpu) {
-            cpu_exit(cpu);
-        }
-    } while (cpu !=3D atomic_mb_read(&tcg_current_rr_cpu));
-}
-
-/* Kick all RR vCPUs */
-static void qemu_cpu_kick_rr_cpus(void)
-{
-    CPUState *cpu;
-
-    CPU_FOREACH(cpu) {
-        cpu_exit(cpu);
-    };
-}
-
-static void kick_tcg_thread(void *opaque)
-{
-    timer_mod(tcg_kick_vcpu_timer, qemu_tcg_next_kick());
-    qemu_cpu_kick_rr_next_cpu();
-}
-
-static void start_tcg_kick_timer(void)
-{
-    assert(!mttcg_enabled);
-    if (!tcg_kick_vcpu_timer && CPU_NEXT(first_cpu)) {
-        tcg_kick_vcpu_timer =3D timer_new_ns(QEMU_CLOCK_VIRTUAL,
-                                           kick_tcg_thread, NULL);
-    }
-    if (tcg_kick_vcpu_timer && !timer_pending(tcg_kick_vcpu_timer)) {
-        timer_mod(tcg_kick_vcpu_timer, qemu_tcg_next_kick());
-    }
-}
-
-static void stop_tcg_kick_timer(void)
-{
-    assert(!mttcg_enabled);
-    if (tcg_kick_vcpu_timer && timer_pending(tcg_kick_vcpu_timer)) {
-        timer_del(tcg_kick_vcpu_timer);
-    }
-}
-
 /***********************************************************/
 void hw_error(const char *fmt, ...)
 {
@@ -204,16 +124,21 @@ void hw_error(const char *fmt, ...)
     abort();
 }
=20
+/*
+ * every accelerator is supposed to register this.
+ * Cannot be done cleanly as a machine state or accel class method,
+ * since TCG is not a normal accelerator yet,
+ * with USER mode being special-cased and other complications.
+ */
+static CpusAccelInterface accel_int;
+
 void cpu_synchronize_all_states(void)
 {
     CPUState *cpu;
=20
     CPU_FOREACH(cpu) {
-        cpu_synchronize_state(cpu);
-        /* TODO: move to cpu_synchronize_state() */
-        if (hvf_enabled()) {
-            hvf_cpu_synchronize_state(cpu);
-        }
+        assert(accel_int.cpu_synchronize_state !=3D NULL);
+        accel_int.cpu_synchronize_state(cpu);
     }
 }
=20
@@ -222,11 +147,8 @@ void cpu_synchronize_all_post_reset(void)
     CPUState *cpu;
=20
     CPU_FOREACH(cpu) {
-        cpu_synchronize_post_reset(cpu);
-        /* TODO: move to cpu_synchronize_post_reset() */
-        if (hvf_enabled()) {
-            hvf_cpu_synchronize_post_reset(cpu);
-        }
+        assert(accel_int.cpu_synchronize_post_reset !=3D NULL);
+        accel_int.cpu_synchronize_post_reset(cpu);
     }
 }
=20
@@ -235,11 +157,8 @@ void cpu_synchronize_all_post_init(void)
     CPUState *cpu;
=20
     CPU_FOREACH(cpu) {
-        cpu_synchronize_post_init(cpu);
-        /* TODO: move to cpu_synchronize_post_init() */
-        if (hvf_enabled()) {
-            hvf_cpu_synchronize_post_init(cpu);
-        }
+        assert(accel_int.cpu_synchronize_post_init !=3D NULL);
+        accel_int.cpu_synchronize_post_init(cpu);
     }
 }
=20
@@ -248,7 +167,8 @@ void cpu_synchronize_all_pre_loadvm(void)
     CPUState *cpu;
=20
     CPU_FOREACH(cpu) {
-        cpu_synchronize_pre_loadvm(cpu);
+        assert(accel_int.cpu_synchronize_pre_loadvm !=3D NULL);
+        accel_int.cpu_synchronize_pre_loadvm(cpu);
     }
 }
=20
@@ -280,7 +200,7 @@ int vm_shutdown(void)
     return do_vm_stop(RUN_STATE_SHUTDOWN, false);
 }
=20
-static bool cpu_can_run(CPUState *cpu)
+bool cpu_can_run(CPUState *cpu)
 {
     if (cpu->stop) {
         return false;
@@ -291,7 +211,7 @@ static bool cpu_can_run(CPUState *cpu)
     return true;
 }
=20
-static void cpu_handle_guest_debug(CPUState *cpu)
+void cpu_handle_guest_debug(CPUState *cpu)
 {
     gdb_set_stop_cpu(cpu);
     qemu_system_debug_request();
@@ -374,18 +294,6 @@ void run_on_cpu(CPUState *cpu, run_on_cpu_func func, r=
un_on_cpu_data data)
     do_run_on_cpu(cpu, func, data, &qemu_global_mutex);
 }
=20
-static void qemu_kvm_destroy_vcpu(CPUState *cpu)
-{
-    if (kvm_destroy_vcpu(cpu) < 0) {
-        error_report("kvm_destroy_vcpu failed");
-        exit(EXIT_FAILURE);
-    }
-}
-
-static void qemu_tcg_destroy_vcpu(CPUState *cpu)
-{
-}
-
 static void qemu_cpu_stop(CPUState *cpu, bool exit)
 {
     g_assert(qemu_cpu_is_self(cpu));
@@ -397,7 +305,7 @@ static void qemu_cpu_stop(CPUState *cpu, bool exit)
     qemu_cond_broadcast(&qemu_pause_cond);
 }
=20
-static void qemu_wait_io_event_common(CPUState *cpu)
+void qemu_wait_io_event_common(CPUState *cpu)
 {
     atomic_mb_set(&cpu->thread_kicked, false);
     if (cpu->stop) {
@@ -406,23 +314,7 @@ static void qemu_wait_io_event_common(CPUState *cpu)
     process_queued_cpu_work(cpu);
 }
=20
-static void qemu_tcg_rr_wait_io_event(void)
-{
-    CPUState *cpu;
-
-    while (all_cpu_threads_idle()) {
-        stop_tcg_kick_timer();
-        qemu_cond_wait(first_cpu->halt_cond, &qemu_global_mutex);
-    }
-
-    start_tcg_kick_timer();
-
-    CPU_FOREACH(cpu) {
-        qemu_wait_io_event_common(cpu);
-    }
-}
-
-static void qemu_wait_io_event(CPUState *cpu)
+void qemu_wait_io_event(CPUState *cpu)
 {
     bool slept =3D false;
=20
@@ -438,7 +330,8 @@ static void qemu_wait_io_event(CPUState *cpu)
     }
=20
 #ifdef _WIN32
-    /* Eat dummy APC queued by qemu_cpu_kick_thread.  */
+    /* Eat dummy APC queued by hax_kick_vcpu_thread */
+    /* NB!!! Should not this be if (hax_enabled)? Is this wrong for whpx? =
*/
     if (!tcg_enabled()) {
         SleepEx(0, TRUE);
     }
@@ -446,540 +339,7 @@ static void qemu_wait_io_event(CPUState *cpu)
     qemu_wait_io_event_common(cpu);
 }
=20
-static void *qemu_kvm_cpu_thread_fn(void *arg)
-{
-    CPUState *cpu =3D arg;
-    int r;
-
-    rcu_register_thread();
-
-    qemu_mutex_lock_iothread();
-    qemu_thread_get_self(cpu->thread);
-    cpu->thread_id =3D qemu_get_thread_id();
-    cpu->can_do_io =3D 1;
-    current_cpu =3D cpu;
-
-    r =3D kvm_init_vcpu(cpu);
-    if (r < 0) {
-        error_report("kvm_init_vcpu failed: %s", strerror(-r));
-        exit(1);
-    }
-
-    kvm_init_cpu_signals(cpu);
-
-    /* signal CPU creation */
-    cpu->created =3D true;
-    qemu_cond_signal(&qemu_cpu_cond);
-    qemu_guest_random_seed_thread_part2(cpu->random_seed);
-
-    do {
-        if (cpu_can_run(cpu)) {
-            r =3D kvm_cpu_exec(cpu);
-            if (r =3D=3D EXCP_DEBUG) {
-                cpu_handle_guest_debug(cpu);
-            }
-        }
-        qemu_wait_io_event(cpu);
-    } while (!cpu->unplug || cpu_can_run(cpu));
-
-    qemu_kvm_destroy_vcpu(cpu);
-    cpu->created =3D false;
-    qemu_cond_signal(&qemu_cpu_cond);
-    qemu_mutex_unlock_iothread();
-    rcu_unregister_thread();
-    return NULL;
-}
-
-static void *qemu_dummy_cpu_thread_fn(void *arg)
-{
-#ifdef _WIN32
-    error_report("qtest is not supported under Windows");
-    exit(1);
-#else
-    CPUState *cpu =3D arg;
-    sigset_t waitset;
-    int r;
-
-    rcu_register_thread();
-
-    qemu_mutex_lock_iothread();
-    qemu_thread_get_self(cpu->thread);
-    cpu->thread_id =3D qemu_get_thread_id();
-    cpu->can_do_io =3D 1;
-    current_cpu =3D cpu;
-
-    sigemptyset(&waitset);
-    sigaddset(&waitset, SIG_IPI);
-
-    /* signal CPU creation */
-    cpu->created =3D true;
-    qemu_cond_signal(&qemu_cpu_cond);
-    qemu_guest_random_seed_thread_part2(cpu->random_seed);
-
-    do {
-        qemu_mutex_unlock_iothread();
-        do {
-            int sig;
-            r =3D sigwait(&waitset, &sig);
-        } while (r =3D=3D -1 && (errno =3D=3D EAGAIN || errno =3D=3D EINTR=
));
-        if (r =3D=3D -1) {
-            perror("sigwait");
-            exit(1);
-        }
-        qemu_mutex_lock_iothread();
-        qemu_wait_io_event(cpu);
-    } while (!cpu->unplug);
-
-    qemu_mutex_unlock_iothread();
-    rcu_unregister_thread();
-    return NULL;
-#endif
-}
-
-static int64_t tcg_get_icount_limit(void)
-{
-    int64_t deadline;
-
-    if (replay_mode !=3D REPLAY_MODE_PLAY) {
-        /*
-         * Include all the timers, because they may need an attention.
-         * Too long CPU execution may create unnecessary delay in UI.
-         */
-        deadline =3D qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL,
-                                              QEMU_TIMER_ATTR_ALL);
-        /* Check realtime timers, because they help with input processing =
*/
-        deadline =3D qemu_soonest_timeout(deadline,
-                qemu_clock_deadline_ns_all(QEMU_CLOCK_REALTIME,
-                                           QEMU_TIMER_ATTR_ALL));
-
-        /* Maintain prior (possibly buggy) behaviour where if no deadline
-         * was set (as there is no QEMU_CLOCK_VIRTUAL timer) or it is more=
 than
-         * INT32_MAX nanoseconds ahead, we still use INT32_MAX
-         * nanoseconds.
-         */
-        if ((deadline < 0) || (deadline > INT32_MAX)) {
-            deadline =3D INT32_MAX;
-        }
-
-        return icount_round(deadline);
-    } else {
-        return replay_get_instructions();
-    }
-}
-
-static void handle_icount_deadline(void)
-{
-    assert(qemu_in_vcpu_thread());
-    if (icount_enabled()) {
-        int64_t deadline =3D qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL,
-                                                      QEMU_TIMER_ATTR_ALL);
-
-        if (deadline =3D=3D 0) {
-            /* Wake up other AioContexts.  */
-            qemu_clock_notify(QEMU_CLOCK_VIRTUAL);
-            qemu_clock_run_timers(QEMU_CLOCK_VIRTUAL);
-        }
-    }
-}
-
-static void prepare_icount_for_run(CPUState *cpu)
-{
-    if (icount_enabled()) {
-        int insns_left;
-
-        /* These should always be cleared by process_icount_data after
-         * each vCPU execution. However u16.high can be raised
-         * asynchronously by cpu_exit/cpu_interrupt/tcg_handle_interrupt
-         */
-        g_assert(cpu_neg(cpu)->icount_decr.u16.low =3D=3D 0);
-        g_assert(cpu->icount_extra =3D=3D 0);
-
-        cpu->icount_budget =3D tcg_get_icount_limit();
-        insns_left =3D MIN(0xffff, cpu->icount_budget);
-        cpu_neg(cpu)->icount_decr.u16.low =3D insns_left;
-        cpu->icount_extra =3D cpu->icount_budget - insns_left;
-
-        replay_mutex_lock();
-    }
-}
-
-static void process_icount_data(CPUState *cpu)
-{
-    if (icount_enabled()) {
-        /* Account for executed instructions */
-        icount_update(cpu);
-
-        /* Reset the counters */
-        cpu_neg(cpu)->icount_decr.u16.low =3D 0;
-        cpu->icount_extra =3D 0;
-        cpu->icount_budget =3D 0;
-
-        replay_account_executed_instructions();
-
-        replay_mutex_unlock();
-    }
-}
-
-
-static int tcg_cpu_exec(CPUState *cpu)
-{
-    int ret;
-#ifdef CONFIG_PROFILER
-    int64_t ti;
-#endif
-
-    assert(tcg_enabled());
-#ifdef CONFIG_PROFILER
-    ti =3D profile_getclock();
-#endif
-    cpu_exec_start(cpu);
-    ret =3D cpu_exec(cpu);
-    cpu_exec_end(cpu);
-#ifdef CONFIG_PROFILER
-    atomic_set(&tcg_ctx->prof.cpu_exec_time,
-               tcg_ctx->prof.cpu_exec_time + profile_getclock() - ti);
-#endif
-    return ret;
-}
-
-/* Destroy any remaining vCPUs which have been unplugged and have
- * finished running
- */
-static void deal_with_unplugged_cpus(void)
-{
-    CPUState *cpu;
-
-    CPU_FOREACH(cpu) {
-        if (cpu->unplug && !cpu_can_run(cpu)) {
-            qemu_tcg_destroy_vcpu(cpu);
-            cpu->created =3D false;
-            qemu_cond_signal(&qemu_cpu_cond);
-            break;
-        }
-    }
-}
-
-/* Single-threaded TCG
- *
- * In the single-threaded case each vCPU is simulated in turn. If
- * there is more than a single vCPU we create a simple timer to kick
- * the vCPU and ensure we don't get stuck in a tight loop in one vCPU.
- * This is done explicitly rather than relying on side-effects
- * elsewhere.
- */
-
-static void *qemu_tcg_rr_cpu_thread_fn(void *arg)
-{
-    CPUState *cpu =3D arg;
-
-    assert(tcg_enabled());
-    rcu_register_thread();
-    tcg_register_thread();
-
-    qemu_mutex_lock_iothread();
-    qemu_thread_get_self(cpu->thread);
-
-    cpu->thread_id =3D qemu_get_thread_id();
-    cpu->created =3D true;
-    cpu->can_do_io =3D 1;
-    qemu_cond_signal(&qemu_cpu_cond);
-    qemu_guest_random_seed_thread_part2(cpu->random_seed);
-
-    /* wait for initial kick-off after machine start */
-    while (first_cpu->stopped) {
-        qemu_cond_wait(first_cpu->halt_cond, &qemu_global_mutex);
-
-        /* process any pending work */
-        CPU_FOREACH(cpu) {
-            current_cpu =3D cpu;
-            qemu_wait_io_event_common(cpu);
-        }
-    }
-
-    start_tcg_kick_timer();
-
-    cpu =3D first_cpu;
-
-    /* process any pending work */
-    cpu->exit_request =3D 1;
-
-    while (1) {
-        qemu_mutex_unlock_iothread();
-        replay_mutex_lock();
-        qemu_mutex_lock_iothread();
-        /* Account partial waits to QEMU_CLOCK_VIRTUAL.  */
-        icount_account_warp_timer();
-
-        /* Run the timers here.  This is much more efficient than
-         * waking up the I/O thread and waiting for completion.
-         */
-        handle_icount_deadline();
-
-        replay_mutex_unlock();
-
-        if (!cpu) {
-            cpu =3D first_cpu;
-        }
-
-        while (cpu && !cpu->queued_work_first && !cpu->exit_request) {
-
-            atomic_mb_set(&tcg_current_rr_cpu, cpu);
-            current_cpu =3D cpu;
-
-            qemu_clock_enable(QEMU_CLOCK_VIRTUAL,
-                              (cpu->singlestep_enabled & SSTEP_NOTIMER) =
=3D=3D 0);
-
-            if (cpu_can_run(cpu)) {
-                int r;
-
-                qemu_mutex_unlock_iothread();
-                prepare_icount_for_run(cpu);
-
-                r =3D tcg_cpu_exec(cpu);
-
-                process_icount_data(cpu);
-                qemu_mutex_lock_iothread();
-
-                if (r =3D=3D EXCP_DEBUG) {
-                    cpu_handle_guest_debug(cpu);
-                    break;
-                } else if (r =3D=3D EXCP_ATOMIC) {
-                    qemu_mutex_unlock_iothread();
-                    cpu_exec_step_atomic(cpu);
-                    qemu_mutex_lock_iothread();
-                    break;
-                }
-            } else if (cpu->stop) {
-                if (cpu->unplug) {
-                    cpu =3D CPU_NEXT(cpu);
-                }
-                break;
-            }
-
-            cpu =3D CPU_NEXT(cpu);
-        } /* while (cpu && !cpu->exit_request).. */
-
-        /* Does not need atomic_mb_set because a spurious wakeup is okay. =
 */
-        atomic_set(&tcg_current_rr_cpu, NULL);
-
-        if (cpu && cpu->exit_request) {
-            atomic_mb_set(&cpu->exit_request, 0);
-        }
-
-        if (icount_enabled() && all_cpu_threads_idle()) {
-            /*
-             * When all cpus are sleeping (e.g in WFI), to avoid a deadlock
-             * in the main_loop, wake it up in order to start the warp tim=
er.
-             */
-            qemu_notify_event();
-        }
-
-        qemu_tcg_rr_wait_io_event();
-        deal_with_unplugged_cpus();
-    }
-
-    rcu_unregister_thread();
-    return NULL;
-}
-
-static void *qemu_hax_cpu_thread_fn(void *arg)
-{
-    CPUState *cpu =3D arg;
-    int r;
-
-    rcu_register_thread();
-    qemu_mutex_lock_iothread();
-    qemu_thread_get_self(cpu->thread);
-
-    cpu->thread_id =3D qemu_get_thread_id();
-    cpu->created =3D true;
-    current_cpu =3D cpu;
-
-    hax_init_vcpu(cpu);
-    qemu_cond_signal(&qemu_cpu_cond);
-    qemu_guest_random_seed_thread_part2(cpu->random_seed);
-
-    do {
-        if (cpu_can_run(cpu)) {
-            r =3D hax_smp_cpu_exec(cpu);
-            if (r =3D=3D EXCP_DEBUG) {
-                cpu_handle_guest_debug(cpu);
-            }
-        }
-
-        qemu_wait_io_event(cpu);
-    } while (!cpu->unplug || cpu_can_run(cpu));
-    rcu_unregister_thread();
-    return NULL;
-}
-
-/* The HVF-specific vCPU thread function. This one should only run when th=
e host
- * CPU supports the VMX "unrestricted guest" feature. */
-static void *qemu_hvf_cpu_thread_fn(void *arg)
-{
-    CPUState *cpu =3D arg;
-
-    int r;
-
-    assert(hvf_enabled());
-
-    rcu_register_thread();
-
-    qemu_mutex_lock_iothread();
-    qemu_thread_get_self(cpu->thread);
-
-    cpu->thread_id =3D qemu_get_thread_id();
-    cpu->can_do_io =3D 1;
-    current_cpu =3D cpu;
-
-    hvf_init_vcpu(cpu);
-
-    /* signal CPU creation */
-    cpu->created =3D true;
-    qemu_cond_signal(&qemu_cpu_cond);
-    qemu_guest_random_seed_thread_part2(cpu->random_seed);
-
-    do {
-        if (cpu_can_run(cpu)) {
-            r =3D hvf_vcpu_exec(cpu);
-            if (r =3D=3D EXCP_DEBUG) {
-                cpu_handle_guest_debug(cpu);
-            }
-        }
-        qemu_wait_io_event(cpu);
-    } while (!cpu->unplug || cpu_can_run(cpu));
-
-    hvf_vcpu_destroy(cpu);
-    cpu->created =3D false;
-    qemu_cond_signal(&qemu_cpu_cond);
-    qemu_mutex_unlock_iothread();
-    rcu_unregister_thread();
-    return NULL;
-}
-
-static void *qemu_whpx_cpu_thread_fn(void *arg)
-{
-    CPUState *cpu =3D arg;
-    int r;
-
-    rcu_register_thread();
-
-    qemu_mutex_lock_iothread();
-    qemu_thread_get_self(cpu->thread);
-    cpu->thread_id =3D qemu_get_thread_id();
-    current_cpu =3D cpu;
-
-    r =3D whpx_init_vcpu(cpu);
-    if (r < 0) {
-        fprintf(stderr, "whpx_init_vcpu failed: %s\n", strerror(-r));
-        exit(1);
-    }
-
-    /* signal CPU creation */
-    cpu->created =3D true;
-    qemu_cond_signal(&qemu_cpu_cond);
-    qemu_guest_random_seed_thread_part2(cpu->random_seed);
-
-    do {
-        if (cpu_can_run(cpu)) {
-            r =3D whpx_vcpu_exec(cpu);
-            if (r =3D=3D EXCP_DEBUG) {
-                cpu_handle_guest_debug(cpu);
-            }
-        }
-        while (cpu_thread_is_idle(cpu)) {
-            qemu_cond_wait(cpu->halt_cond, &qemu_global_mutex);
-        }
-        qemu_wait_io_event_common(cpu);
-    } while (!cpu->unplug || cpu_can_run(cpu));
-
-    whpx_destroy_vcpu(cpu);
-    cpu->created =3D false;
-    qemu_cond_signal(&qemu_cpu_cond);
-    qemu_mutex_unlock_iothread();
-    rcu_unregister_thread();
-    return NULL;
-}
-
-#ifdef _WIN32
-static void CALLBACK dummy_apc_func(ULONG_PTR unused)
-{
-}
-#endif
-
-/* Multi-threaded TCG
- *
- * In the multi-threaded case each vCPU has its own thread. The TLS
- * variable current_cpu can be used deep in the code to find the
- * current CPUState for a given thread.
- */
-
-static void *qemu_tcg_cpu_thread_fn(void *arg)
-{
-    CPUState *cpu =3D arg;
-
-    assert(tcg_enabled());
-    g_assert(!icount_enabled());
-
-    rcu_register_thread();
-    tcg_register_thread();
-
-    qemu_mutex_lock_iothread();
-    qemu_thread_get_self(cpu->thread);
-
-    cpu->thread_id =3D qemu_get_thread_id();
-    cpu->created =3D true;
-    cpu->can_do_io =3D 1;
-    current_cpu =3D cpu;
-    qemu_cond_signal(&qemu_cpu_cond);
-    qemu_guest_random_seed_thread_part2(cpu->random_seed);
-
-    /* process any pending work */
-    cpu->exit_request =3D 1;
-
-    do {
-        if (cpu_can_run(cpu)) {
-            int r;
-            qemu_mutex_unlock_iothread();
-            r =3D tcg_cpu_exec(cpu);
-            qemu_mutex_lock_iothread();
-            switch (r) {
-            case EXCP_DEBUG:
-                cpu_handle_guest_debug(cpu);
-                break;
-            case EXCP_HALTED:
-                /* during start-up the vCPU is reset and the thread is
-                 * kicked several times. If we don't ensure we go back
-                 * to sleep in the halted state we won't cleanly
-                 * start-up when the vCPU is enabled.
-                 *
-                 * cpu->halted should ensure we sleep in wait_io_event
-                 */
-                g_assert(cpu->halted);
-                break;
-            case EXCP_ATOMIC:
-                qemu_mutex_unlock_iothread();
-                cpu_exec_step_atomic(cpu);
-                qemu_mutex_lock_iothread();
-            default:
-                /* Ignore everything else? */
-                break;
-            }
-        }
-
-        atomic_mb_set(&cpu->exit_request, 0);
-        qemu_wait_io_event(cpu);
-    } while (!cpu->unplug || cpu_can_run(cpu));
-
-    qemu_tcg_destroy_vcpu(cpu);
-    cpu->created =3D false;
-    qemu_cond_signal(&qemu_cpu_cond);
-    qemu_mutex_unlock_iothread();
-    rcu_unregister_thread();
-    return NULL;
-}
-
-static void qemu_cpu_kick_thread(CPUState *cpu)
+void cpus_kick_thread(CPUState *cpu)
 {
 #ifndef _WIN32
     int err;
@@ -993,44 +353,20 @@ static void qemu_cpu_kick_thread(CPUState *cpu)
         fprintf(stderr, "qemu:%s: %s", __func__, strerror(err));
         exit(1);
     }
-#else /* _WIN32 */
-    if (!qemu_cpu_is_self(cpu)) {
-        if (whpx_enabled()) {
-            whpx_vcpu_kick(cpu);
-        } else if (!QueueUserAPC(dummy_apc_func, cpu->hThread, 0)) {
-            fprintf(stderr, "%s: QueueUserAPC failed with error %lu\n",
-                    __func__, GetLastError());
-            exit(1);
-        }
-    }
 #endif
 }
=20
 void qemu_cpu_kick(CPUState *cpu)
 {
     qemu_cond_broadcast(cpu->halt_cond);
-    if (tcg_enabled()) {
-        if (qemu_tcg_mttcg_enabled()) {
-            cpu_exit(cpu);
-        } else {
-            qemu_cpu_kick_rr_cpus();
-        }
-    } else {
-        if (hax_enabled()) {
-            /*
-             * FIXME: race condition with the exit_request check in
-             * hax_vcpu_hax_exec
-             */
-            cpu->exit_request =3D 1;
-        }
-        qemu_cpu_kick_thread(cpu);
-    }
+    assert(accel_int.kick_vcpu_thread !=3D NULL);
+    accel_int.kick_vcpu_thread(cpu);
 }
=20
 void qemu_cpu_kick_self(void)
 {
     assert(current_cpu);
-    qemu_cpu_kick_thread(current_cpu);
+    cpus_kick_thread(current_cpu);
 }
=20
 bool qemu_cpu_is_self(CPUState *cpu)
@@ -1080,6 +416,21 @@ void qemu_cond_timedwait_iothread(QemuCond *cond, int=
 ms)
     qemu_cond_timedwait(cond, &qemu_global_mutex, ms);
 }
=20
+/* signal CPU creation */
+void cpu_thread_signal_created(CPUState *cpu)
+{
+    cpu->created =3D true;
+    qemu_cond_signal(&qemu_cpu_cond);
+}
+
+/* signal CPU destruction */
+void cpu_thread_signal_destroyed(CPUState *cpu)
+{
+    cpu->created =3D false;
+    qemu_cond_signal(&qemu_cpu_cond);
+}
+
+
 static bool all_vcpus_paused(void)
 {
     CPUState *cpu;
@@ -1155,140 +506,18 @@ void cpu_remove_sync(CPUState *cpu)
     qemu_mutex_lock_iothread();
 }
=20
-/* For temporary buffers for forming a name */
-#define VCPU_THREAD_NAME_SIZE 16
-
-static void qemu_tcg_init_vcpu(CPUState *cpu)
-{
-    char thread_name[VCPU_THREAD_NAME_SIZE];
-    static QemuCond *single_tcg_halt_cond;
-    static QemuThread *single_tcg_cpu_thread;
-    static int tcg_region_inited;
-
-    assert(tcg_enabled());
-    /*
-     * Initialize TCG regions--once. Now is a good time, because:
-     * (1) TCG's init context, prologue and target globals have been set u=
p.
-     * (2) qemu_tcg_mttcg_enabled() works now (TCG init code runs before t=
he
-     *     -accel flag is processed, so the check doesn't work then).
-     */
-    if (!tcg_region_inited) {
-        tcg_region_inited =3D 1;
-        tcg_region_init();
-    }
-
-    if (qemu_tcg_mttcg_enabled() || !single_tcg_cpu_thread) {
-        cpu->thread =3D g_malloc0(sizeof(QemuThread));
-        cpu->halt_cond =3D g_malloc0(sizeof(QemuCond));
-        qemu_cond_init(cpu->halt_cond);
-
-        if (qemu_tcg_mttcg_enabled()) {
-            /* create a thread per vCPU with TCG (MTTCG) */
-            parallel_cpus =3D true;
-            snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/TCG",
-                 cpu->cpu_index);
-
-            qemu_thread_create(cpu->thread, thread_name, qemu_tcg_cpu_thre=
ad_fn,
-                               cpu, QEMU_THREAD_JOINABLE);
-
-        } else {
-            /* share a single thread for all cpus with TCG */
-            snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "ALL CPUs/TCG");
-            qemu_thread_create(cpu->thread, thread_name,
-                               qemu_tcg_rr_cpu_thread_fn,
-                               cpu, QEMU_THREAD_JOINABLE);
-
-            single_tcg_halt_cond =3D cpu->halt_cond;
-            single_tcg_cpu_thread =3D cpu->thread;
-        }
-#ifdef _WIN32
-        cpu->hThread =3D qemu_thread_get_handle(cpu->thread);
-#endif
-    } else {
-        /* For non-MTTCG cases we share the thread */
-        cpu->thread =3D single_tcg_cpu_thread;
-        cpu->halt_cond =3D single_tcg_halt_cond;
-        cpu->thread_id =3D first_cpu->thread_id;
-        cpu->can_do_io =3D 1;
-        cpu->created =3D true;
-    }
-}
-
-static void qemu_hax_start_vcpu(CPUState *cpu)
-{
-    char thread_name[VCPU_THREAD_NAME_SIZE];
-
-    cpu->thread =3D g_malloc0(sizeof(QemuThread));
-    cpu->halt_cond =3D g_malloc0(sizeof(QemuCond));
-    qemu_cond_init(cpu->halt_cond);
-
-    snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/HAX",
-             cpu->cpu_index);
-    qemu_thread_create(cpu->thread, thread_name, qemu_hax_cpu_thread_fn,
-                       cpu, QEMU_THREAD_JOINABLE);
-#ifdef _WIN32
-    cpu->hThread =3D qemu_thread_get_handle(cpu->thread);
-#endif
-}
-
-static void qemu_kvm_start_vcpu(CPUState *cpu)
+void cpus_register_accel_interface(CpusAccelInterface *i)
 {
-    char thread_name[VCPU_THREAD_NAME_SIZE];
-
-    cpu->thread =3D g_malloc0(sizeof(QemuThread));
-    cpu->halt_cond =3D g_malloc0(sizeof(QemuCond));
-    qemu_cond_init(cpu->halt_cond);
-    snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/KVM",
-             cpu->cpu_index);
-    qemu_thread_create(cpu->thread, thread_name, qemu_kvm_cpu_thread_fn,
-                       cpu, QEMU_THREAD_JOINABLE);
-}
-
-static void qemu_hvf_start_vcpu(CPUState *cpu)
-{
-    char thread_name[VCPU_THREAD_NAME_SIZE];
-
-    /* HVF currently does not support TCG, and only runs in
-     * unrestricted-guest mode. */
-    assert(hvf_enabled());
-
-    cpu->thread =3D g_malloc0(sizeof(QemuThread));
-    cpu->halt_cond =3D g_malloc0(sizeof(QemuCond));
-    qemu_cond_init(cpu->halt_cond);
+    assert(i !=3D NULL);
+    assert(i->create_vcpu_thread !=3D NULL);
+    assert(i->kick_vcpu_thread !=3D NULL);
=20
-    snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/HVF",
-             cpu->cpu_index);
-    qemu_thread_create(cpu->thread, thread_name, qemu_hvf_cpu_thread_fn,
-                       cpu, QEMU_THREAD_JOINABLE);
-}
-
-static void qemu_whpx_start_vcpu(CPUState *cpu)
-{
-    char thread_name[VCPU_THREAD_NAME_SIZE];
-
-    cpu->thread =3D g_malloc0(sizeof(QemuThread));
-    cpu->halt_cond =3D g_malloc0(sizeof(QemuCond));
-    qemu_cond_init(cpu->halt_cond);
-    snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/WHPX",
-             cpu->cpu_index);
-    qemu_thread_create(cpu->thread, thread_name, qemu_whpx_cpu_thread_fn,
-                       cpu, QEMU_THREAD_JOINABLE);
-#ifdef _WIN32
-    cpu->hThread =3D qemu_thread_get_handle(cpu->thread);
-#endif
-}
+    assert(i->cpu_synchronize_post_reset !=3D NULL);
+    assert(i->cpu_synchronize_post_init !=3D NULL);
+    assert(i->cpu_synchronize_state !=3D NULL);
+    assert(i->cpu_synchronize_pre_loadvm !=3D NULL);
=20
-static void qemu_dummy_start_vcpu(CPUState *cpu)
-{
-    char thread_name[VCPU_THREAD_NAME_SIZE];
-
-    cpu->thread =3D g_malloc0(sizeof(QemuThread));
-    cpu->halt_cond =3D g_malloc0(sizeof(QemuCond));
-    qemu_cond_init(cpu->halt_cond);
-    snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/DUMMY",
-             cpu->cpu_index);
-    qemu_thread_create(cpu->thread, thread_name, qemu_dummy_cpu_thread_fn,=
 cpu,
-                       QEMU_THREAD_JOINABLE);
+    accel_int =3D *i;
 }
=20
 void qemu_init_vcpu(CPUState *cpu)
@@ -1308,19 +537,8 @@ void qemu_init_vcpu(CPUState *cpu)
         cpu_address_space_init(cpu, 0, "cpu-memory", cpu->memory);
     }
=20
-    if (kvm_enabled()) {
-        qemu_kvm_start_vcpu(cpu);
-    } else if (hax_enabled()) {
-        qemu_hax_start_vcpu(cpu);
-    } else if (hvf_enabled()) {
-        qemu_hvf_start_vcpu(cpu);
-    } else if (tcg_enabled()) {
-        qemu_tcg_init_vcpu(cpu);
-    } else if (whpx_enabled()) {
-        qemu_whpx_start_vcpu(cpu);
-    } else {
-        qemu_dummy_start_vcpu(cpu);
-    }
+    assert(accel_int.create_vcpu_thread !=3D NULL);
+    accel_int.create_vcpu_thread(cpu);
=20
     while (!cpu->created) {
         qemu_cond_wait(&qemu_cpu_cond, &qemu_global_mutex);
@@ -1498,3 +716,26 @@ void qmp_inject_nmi(Error **errp)
     nmi_monitor_handle(monitor_get_cpu_index(), errp);
 }
=20
+void cpu_synchronize_state(CPUState *cpu)
+{
+    assert(accel_int.cpu_synchronize_state !=3D NULL);
+    accel_int.cpu_synchronize_state(cpu);
+}
+
+void cpu_synchronize_post_reset(CPUState *cpu)
+{
+    assert(accel_int.cpu_synchronize_post_reset !=3D NULL);
+    accel_int.cpu_synchronize_post_reset(cpu);
+}
+
+void cpu_synchronize_post_init(CPUState *cpu)
+{
+    assert(accel_int.cpu_synchronize_post_init !=3D NULL);
+    accel_int.cpu_synchronize_post_init(cpu);
+}
+
+void cpu_synchronize_pre_loadvm(CPUState *cpu)
+{
+    assert(accel_int.cpu_synchronize_pre_loadvm !=3D NULL);
+    accel_int.cpu_synchronize_pre_loadvm(cpu);
+}
diff --git a/hw/core/cpu.c b/hw/core/cpu.c
index 5284d384fb..b601654a10 100644
--- a/hw/core/cpu.c
+++ b/hw/core/cpu.c
@@ -33,6 +33,7 @@
 #include "hw/qdev-properties.h"
 #include "trace-root.h"
 #include "qemu/plugin.h"
+#include "sysemu/hw_accel.h"
=20
 CPUInterruptHandler cpu_interrupt_handler;
=20
diff --git a/include/sysemu/cpus.h b/include/sysemu/cpus.h
index 149de000a0..b9d76ebda8 100644
--- a/include/sysemu/cpus.h
+++ b/include/sysemu/cpus.h
@@ -4,7 +4,51 @@
 #include "qemu/timer.h"
=20
 /* cpus.c */
+
+/* CPU execution threads */
+
+typedef struct CpusAccelInterface {
+    void (*create_vcpu_thread)(CPUState *cpu);
+    void (*kick_vcpu_thread)(CPUState *cpu);
+
+    void (*cpu_synchronize_post_reset)(CPUState *cpu);
+    void (*cpu_synchronize_post_init)(CPUState *cpu);
+    void (*cpu_synchronize_state)(CPUState *cpu);
+    void (*cpu_synchronize_pre_loadvm)(CPUState *cpu);
+} CpusAccelInterface;
+
+/* register accel-specific interface */
+void cpus_register_accel_interface(CpusAccelInterface *i);
+
+/*
+ * these are the registrable vcpu start functions for all accelerators.
+ * They are arguments to qemu_register_start_vcpu();
+ */
+void qemu_dummy_start_vcpu(CPUState *cpu);
+void qemu_tcg_init_vcpu(CPUState *cpu);
+void qemu_kvm_start_vcpu(CPUState *cpu);
+void qemu_hax_start_vcpu(CPUState *cpu);
+void qemu_hvf_start_vcpu(CPUState *cpu);
+void qemu_whpx_start_vcpu(CPUState *cpu);
+/* end of vcpu start functions for accelerators */
+
+/* interface available for cpus accelerator threads */
+
+/* For temporary buffers for forming a name */
+#define VCPU_THREAD_NAME_SIZE 16
+
+void cpus_kick_thread(CPUState *cpu);
+bool cpu_thread_is_idle(CPUState *cpu);
 bool all_cpu_threads_idle(void);
+bool cpu_can_run(CPUState *cpu);
+void qemu_wait_io_event_common(CPUState *cpu);
+void qemu_wait_io_event(CPUState *cpu);
+void cpu_thread_signal_created(CPUState *cpu);
+void cpu_thread_signal_destroyed(CPUState *cpu);
+void cpu_handle_guest_debug(CPUState *cpu);
+
+/* end interface for cpus accelerator threads */
+
 bool qemu_in_vcpu_thread(void);
 void qemu_init_cpu_loop(void);
 void resume_all_vcpus(void);
diff --git a/include/sysemu/hvf.h b/include/sysemu/hvf.h
index d211e808e9..cdd4172b24 100644
--- a/include/sysemu/hvf.h
+++ b/include/sysemu/hvf.h
@@ -86,7 +86,6 @@ int hvf_smp_cpu_exec(CPUState *);
 void hvf_cpu_synchronize_state(CPUState *);
 void hvf_cpu_synchronize_post_reset(CPUState *);
 void hvf_cpu_synchronize_post_init(CPUState *);
-void _hvf_cpu_synchronize_post_init(CPUState *, run_on_cpu_data);
=20
 void hvf_vcpu_destroy(CPUState *);
 void hvf_raise_event(CPUState *);
diff --git a/include/sysemu/hw_accel.h b/include/sysemu/hw_accel.h
index 0ec2372477..336740e10a 100644
--- a/include/sysemu/hw_accel.h
+++ b/include/sysemu/hw_accel.h
@@ -1,5 +1,5 @@
 /*
- * QEMU Hardware accelertors support
+ * QEMU Hardware accelerators support
  *
  * Copyright 2016 Google, Inc.
  *
@@ -16,56 +16,9 @@
 #include "sysemu/kvm.h"
 #include "sysemu/whpx.h"
=20
-static inline void cpu_synchronize_state(CPUState *cpu)
-{
-    if (kvm_enabled()) {
-        kvm_cpu_synchronize_state(cpu);
-    }
-    if (hax_enabled()) {
-        hax_cpu_synchronize_state(cpu);
-    }
-    if (whpx_enabled()) {
-        whpx_cpu_synchronize_state(cpu);
-    }
-}
-
-static inline void cpu_synchronize_post_reset(CPUState *cpu)
-{
-    if (kvm_enabled()) {
-        kvm_cpu_synchronize_post_reset(cpu);
-    }
-    if (hax_enabled()) {
-        hax_cpu_synchronize_post_reset(cpu);
-    }
-    if (whpx_enabled()) {
-        whpx_cpu_synchronize_post_reset(cpu);
-    }
-}
-
-static inline void cpu_synchronize_post_init(CPUState *cpu)
-{
-    if (kvm_enabled()) {
-        kvm_cpu_synchronize_post_init(cpu);
-    }
-    if (hax_enabled()) {
-        hax_cpu_synchronize_post_init(cpu);
-    }
-    if (whpx_enabled()) {
-        whpx_cpu_synchronize_post_init(cpu);
-    }
-}
-
-static inline void cpu_synchronize_pre_loadvm(CPUState *cpu)
-{
-    if (kvm_enabled()) {
-        kvm_cpu_synchronize_pre_loadvm(cpu);
-    }
-    if (hax_enabled()) {
-        hax_cpu_synchronize_pre_loadvm(cpu);
-    }
-    if (whpx_enabled()) {
-        whpx_cpu_synchronize_pre_loadvm(cpu);
-    }
-}
+void cpu_synchronize_state(CPUState *cpu);
+void cpu_synchronize_post_reset(CPUState *cpu);
+void cpu_synchronize_post_init(CPUState *cpu);
+void cpu_synchronize_pre_loadvm(CPUState *cpu);
=20
 #endif /* QEMU_HW_ACCEL_H */
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index 3b2250471c..3d84c940db 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -218,7 +218,7 @@ int kvm_has_intx_set_mask(void);
=20
 int kvm_init_vcpu(CPUState *cpu);
 int kvm_cpu_exec(CPUState *cpu);
-int kvm_destroy_vcpu(CPUState *cpu);
+void kvm_destroy_vcpu(CPUState *cpu);
=20
 /**
  * kvm_arm_supports_user_irq
diff --git a/stubs/Makefile.objs b/stubs/Makefile.objs
index 45be5dc0ed..3bbd72b7cd 100644
--- a/stubs/Makefile.objs
+++ b/stubs/Makefile.objs
@@ -43,4 +43,5 @@ stub-obj-y +=3D pci-host-piix.o
 stub-obj-y +=3D ram-block.o
 stub-obj-y +=3D ramfb.o
 stub-obj-y +=3D fw_cfg.o
+stub-obj-y +=3D cpu-synchronize-state.o
 stub-obj-$(CONFIG_SOFTMMU) +=3D semihost.o
diff --git a/stubs/cpu-synchronize-state.c b/stubs/cpu-synchronize-state.c
new file mode 100644
index 0000000000..3112fe439d
--- /dev/null
+++ b/stubs/cpu-synchronize-state.c
@@ -0,0 +1,15 @@
+#include "qemu/osdep.h"
+#include "sysemu/hw_accel.h"
+
+void cpu_synchronize_state(CPUState *cpu)
+{
+}
+void cpu_synchronize_post_reset(CPUState *cpu)
+{
+}
+void cpu_synchronize_post_init(CPUState *cpu)
+{
+}
+void cpu_synchronize_pre_loadvm(CPUState *cpu)
+{
+}
diff --git a/target/i386/Makefile.objs b/target/i386/Makefile.objs
index 48e0c28434..a847064e68 100644
--- a/target/i386/Makefile.objs
+++ b/target/i386/Makefile.objs
@@ -9,14 +9,15 @@ obj-y +=3D machine.o arch_memory_mapping.o arch_dump.o mo=
nitor.o
 obj-$(CONFIG_KVM) +=3D kvm.o
 obj-$(CONFIG_HYPERV) +=3D hyperv.o
 obj-$(call lnot,$(CONFIG_HYPERV)) +=3D hyperv-stub.o
+obj-$(CONFIG_HAX) +=3D hax-all.o hax-mem.o hax-cpus-interface.o
 ifeq ($(CONFIG_WIN32),y)
-obj-$(CONFIG_HAX) +=3D hax-all.o hax-mem.o hax-windows.o
+obj-$(CONFIG_HAX) +=3D hax-windows.o
 endif
 ifeq ($(CONFIG_POSIX),y)
-obj-$(CONFIG_HAX) +=3D hax-all.o hax-mem.o hax-posix.o
+obj-$(CONFIG_HAX) +=3D hax-posix.o
 endif
 obj-$(CONFIG_HVF) +=3D hvf/
-obj-$(CONFIG_WHPX) +=3D whpx-all.o
+obj-$(CONFIG_WHPX) +=3D whpx-all.o whpx-cpus-interface.o
 endif
 obj-$(CONFIG_SEV) +=3D sev.o
 obj-$(call lnot,$(CONFIG_SEV)) +=3D sev-stub.o
diff --git a/target/i386/hax-all.c b/target/i386/hax-all.c
index f9c83fff25..56c99bffd2 100644
--- a/target/i386/hax-all.c
+++ b/target/i386/hax-all.c
@@ -32,9 +32,10 @@
 #include "sysemu/accel.h"
 #include "sysemu/reset.h"
 #include "sysemu/runstate.h"
-#include "qemu/main-loop.h"
 #include "hw/boards.h"
=20
+#include "hax-cpus-interface.h"
+
 #define DEBUG_HAX 0
=20
 #define DPRINTF(fmt, ...) \
@@ -361,6 +362,9 @@ static int hax_accel_init(MachineState *ms)
                 !ret ? "working" : "not working",
                 !ret ? "fast virt" : "emulation");
     }
+    if (ret =3D=3D 0) {
+        cpus_register_accel_interface(&hax_cpus_interface);
+    }
     return ret;
 }
=20
diff --git a/target/i386/hax-cpus-interface.c b/target/i386/hax-cpus-interf=
ace.c
new file mode 100644
index 0000000000..85cbfb4ae8
--- /dev/null
+++ b/target/i386/hax-cpus-interface.c
@@ -0,0 +1,85 @@
+/*
+ * QEMU HAX support
+ *
+ * Copyright IBM, Corp. 2008
+ *           Red Hat, Inc. 2008
+ *
+ * Authors:
+ *  Anthony Liguori   <aliguori@us.ibm.com>
+ *  Glauber Costa     <gcosta@redhat.com>
+ *
+ * Copyright (c) 2011 Intel Corporation
+ *  Written by:
+ *  Jiang Yunhong<yunhong.jiang@intel.com>
+ *  Xin Xiaohui<xiaohui.xin@intel.com>
+ *  Zhang Xiantao<xiantao.zhang@intel.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or late=
r.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/error-report.h"
+#include "qemu/main-loop.h"
+#include "hax-i386.h"
+#include "sysemu/runstate.h"
+#include "sysemu/cpus.h"
+#include "qemu/guest-random.h"
+
+#include "hax-cpus-interface.h"
+
+static void *hax_cpu_thread_fn(void *arg)
+{
+    CPUState *cpu =3D arg;
+    int r;
+
+    rcu_register_thread();
+    qemu_mutex_lock_iothread();
+    qemu_thread_get_self(cpu->thread);
+
+    cpu->thread_id =3D qemu_get_thread_id();
+    hax_init_vcpu(cpu);
+    cpu_thread_signal_created(cpu);
+    qemu_guest_random_seed_thread_part2(cpu->random_seed);
+
+    do {
+        if (cpu_can_run(cpu)) {
+            r =3D hax_smp_cpu_exec(cpu);
+            if (r =3D=3D EXCP_DEBUG) {
+                cpu_handle_guest_debug(cpu);
+            }
+        }
+
+        qemu_wait_io_event(cpu);
+    } while (!cpu->unplug || cpu_can_run(cpu));
+    rcu_unregister_thread();
+    return NULL;
+}
+
+static void hax_start_vcpu_thread(CPUState *cpu)
+{
+    char thread_name[VCPU_THREAD_NAME_SIZE];
+
+    cpu->thread =3D g_malloc0(sizeof(QemuThread));
+    cpu->halt_cond =3D g_malloc0(sizeof(QemuCond));
+    qemu_cond_init(cpu->halt_cond);
+
+    snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/HAX",
+             cpu->cpu_index);
+    qemu_thread_create(cpu->thread, thread_name, hax_cpu_thread_fn,
+                       cpu, QEMU_THREAD_JOINABLE);
+#ifdef _WIN32
+    cpu->hThread =3D qemu_thread_get_handle(cpu->thread);
+#endif
+}
+
+CpusAccelInterface hax_cpus_interface =3D {
+    .create_vcpu_thread =3D hax_start_vcpu_thread,
+    .kick_vcpu_thread =3D hax_kick_vcpu_thread,
+
+    .cpu_synchronize_post_reset =3D hax_cpu_synchronize_post_reset,
+    .cpu_synchronize_post_init =3D hax_cpu_synchronize_post_init,
+    .cpu_synchronize_state =3D hax_cpu_synchronize_state,
+    .cpu_synchronize_pre_loadvm =3D hax_cpu_synchronize_pre_loadvm,
+};
diff --git a/target/i386/hax-cpus-interface.h b/target/i386/hax-cpus-interf=
ace.h
new file mode 100644
index 0000000000..a06f3df752
--- /dev/null
+++ b/target/i386/hax-cpus-interface.h
@@ -0,0 +1,8 @@
+#ifndef HAX_CPUS_INTERFACE_H
+#define HAX_CPUS_INTERFACE_H
+
+#include "sysemu/cpus.h"
+
+extern CpusAccelInterface hax_cpus_interface;
+
+#endif /* HAX_CPUS_INTERFACE */
diff --git a/target/i386/hax-i386.h b/target/i386/hax-i386.h
index 54e9d8b057..667139c7af 100644
--- a/target/i386/hax-i386.h
+++ b/target/i386/hax-i386.h
@@ -61,6 +61,8 @@ int hax_inject_interrupt(CPUArchState *env, int vector);
 struct hax_vm *hax_vm_create(struct hax_state *hax);
 int hax_vcpu_run(struct hax_vcpu_state *vcpu);
 int hax_vcpu_create(int id);
+void hax_kick_vcpu_thread(CPUState *cpu);
+
 int hax_sync_vcpu_state(CPUArchState *env, struct vcpu_state_t *state,
                         int set);
 int hax_sync_msr(CPUArchState *env, struct hax_msr_data *msrs, int set);
diff --git a/target/i386/hax-posix.c b/target/i386/hax-posix.c
index 3bad89f133..ea956ddfc1 100644
--- a/target/i386/hax-posix.c
+++ b/target/i386/hax-posix.c
@@ -16,6 +16,8 @@
=20
 #include "target/i386/hax-i386.h"
=20
+#include "sysemu/cpus.h"
+
 hax_fd hax_mod_open(void)
 {
     int fd =3D open("/dev/HAX", O_RDWR);
@@ -292,3 +294,13 @@ int hax_inject_interrupt(CPUArchState *env, int vector)
=20
     return ioctl(fd, HAX_VCPU_IOCTL_INTERRUPT, &vector);
 }
+
+void hax_kick_vcpu_thread(CPUState *cpu)
+{
+    /*
+     * FIXME: race condition with the exit_request check in
+     * hax_vcpu_hax_exec
+     */
+    cpu->exit_request =3D 1;
+    cpus_kick_thread(cpu);
+}
diff --git a/target/i386/hax-windows.c b/target/i386/hax-windows.c
index 863c2bcc19..469b48e608 100644
--- a/target/i386/hax-windows.c
+++ b/target/i386/hax-windows.c
@@ -463,3 +463,23 @@ int hax_inject_interrupt(CPUArchState *env, int vector)
         return 0;
     }
 }
+
+static void CALLBACK dummy_apc_func(ULONG_PTR unused)
+{
+}
+
+void hax_kick_vcpu_thread(CPUState *cpu)
+{
+    /*
+     * FIXME: race condition with the exit_request check in
+     * hax_vcpu_hax_exec
+     */
+    cpu->exit_request =3D 1;
+    if (!qemu_cpu_is_self(cpu)) {
+        if (!QueueUserAPC(dummy_apc_func, cpu->hThread, 0)) {
+            fprintf(stderr, "%s: QueueUserAPC failed with error %lu\n",
+                    __func__, GetLastError());
+            exit(1);
+        }
+    }
+}
diff --git a/target/i386/hvf/Makefile.objs b/target/i386/hvf/Makefile.objs
index 927b86bc67..bdbc2c0227 100644
--- a/target/i386/hvf/Makefile.objs
+++ b/target/i386/hvf/Makefile.objs
@@ -1,2 +1,2 @@
-obj-y +=3D hvf.o
+obj-y +=3D hvf.o hvf-cpus-interface.o
 obj-y +=3D x86.o x86_cpuid.o x86_decode.o x86_descr.o x86_emu.o x86_flags.=
o x86_mmu.o x86hvf.o x86_task.o
diff --git a/target/i386/hvf/hvf-cpus-interface.c b/target/i386/hvf/hvf-cpu=
s-interface.c
new file mode 100644
index 0000000000..54bfe307c1
--- /dev/null
+++ b/target/i386/hvf/hvf-cpus-interface.c
@@ -0,0 +1,92 @@
+#include "qemu/osdep.h"
+#include "qemu/error-report.h"
+#include "qemu/main-loop.h"
+#include "sysemu/hvf.h"
+#include "sysemu/runstate.h"
+#include "sysemu/cpus.h"
+#include "qemu/guest-random.h"
+
+#include "hvf-cpus-interface.h"
+
+/*
+ * The HVF-specific vCPU thread function. This one should only run when th=
e host
+ * CPU supports the VMX "unrestricted guest" feature.
+ */
+static void *hvf_cpu_thread_fn(void *arg)
+{
+    CPUState *cpu =3D arg;
+
+    int r;
+
+    assert(hvf_enabled());
+
+    rcu_register_thread();
+
+    qemu_mutex_lock_iothread();
+    qemu_thread_get_self(cpu->thread);
+
+    cpu->thread_id =3D qemu_get_thread_id();
+    cpu->can_do_io =3D 1;
+    current_cpu =3D cpu;
+
+    hvf_init_vcpu(cpu);
+
+    /* signal CPU creation */
+    cpu_thread_signal_created(cpu);
+    qemu_guest_random_seed_thread_part2(cpu->random_seed);
+
+    do {
+        if (cpu_can_run(cpu)) {
+            r =3D hvf_vcpu_exec(cpu);
+            if (r =3D=3D EXCP_DEBUG) {
+                cpu_handle_guest_debug(cpu);
+            }
+        }
+        qemu_wait_io_event(cpu);
+    } while (!cpu->unplug || cpu_can_run(cpu));
+
+    hvf_vcpu_destroy(cpu);
+    cpu_thread_signal_destroyed(cpu);
+    qemu_mutex_unlock_iothread();
+    rcu_unregister_thread();
+    return NULL;
+}
+
+static void hvf_kick_vcpu_thread(CPUState *cpu)
+{
+    cpus_kick_thread(cpu);
+}
+
+static void hvf_cpu_synchronize_noop(CPUState *cpu)
+{
+}
+
+static void hvf_start_vcpu_thread(CPUState *cpu)
+{
+    char thread_name[VCPU_THREAD_NAME_SIZE];
+
+    /*
+     * HVF currently does not support TCG, and only runs in
+     * unrestricted-guest mode.
+     */
+    assert(hvf_enabled());
+
+    cpu->thread =3D g_malloc0(sizeof(QemuThread));
+    cpu->halt_cond =3D g_malloc0(sizeof(QemuCond));
+    qemu_cond_init(cpu->halt_cond);
+
+    snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/HVF",
+             cpu->cpu_index);
+    qemu_thread_create(cpu->thread, thread_name, hvf_cpu_thread_fn,
+                       cpu, QEMU_THREAD_JOINABLE);
+}
+
+CpusAccelInterface hvf_cpus_interface =3D {
+    .create_vcpu_thread =3D hvf_start_vcpu_thread,
+    .kick_vcpu_thread =3D hvf_kick_vcpu_thread,
+
+    .cpu_synchronize_post_reset =3D hvf_cpu_synchronize_noop,
+    .cpu_synchronize_post_init =3D hvf_cpu_synchronize_noop,
+    .cpu_synchronize_state =3D hvf_cpu_synchronize_noop,
+    .cpu_synchronize_pre_loadvm =3D hvf_cpu_synchronize_noop,
+};
diff --git a/target/i386/hvf/hvf-cpus-interface.h b/target/i386/hvf/hvf-cpu=
s-interface.h
new file mode 100644
index 0000000000..6ea38742e5
--- /dev/null
+++ b/target/i386/hvf/hvf-cpus-interface.h
@@ -0,0 +1,8 @@
+#ifndef HVF_CPUS_INTERFACE_H
+#define HVF_CPUS_INTERFACE_H
+
+#include "sysemu/cpus.h"
+
+extern CpusAccelInterface hvf_cpus_interface;
+
+#endif /* HVF_CPUS_INTERFACE */
diff --git a/target/i386/hvf/hvf.c b/target/i386/hvf/hvf.c
index d72543dc31..75cdf88cec 100644
--- a/target/i386/hvf/hvf.c
+++ b/target/i386/hvf/hvf.c
@@ -72,6 +72,8 @@
 #include "sysemu/accel.h"
 #include "target/i386/cpu.h"
=20
+#include "hvf-cpus-interface.h"
+
 HVFState *hvf_state;
=20
 static void assert_hvf_ok(hv_return_t ret)
@@ -312,7 +314,7 @@ void hvf_cpu_synchronize_post_reset(CPUState *cpu_state)
     run_on_cpu(cpu_state, do_hvf_cpu_synchronize_post_reset, RUN_ON_CPU_NU=
LL);
 }
=20
-void _hvf_cpu_synchronize_post_init(CPUState *cpu, run_on_cpu_data arg)
+static void _hvf_cpu_synchronize_post_init(CPUState *cpu, run_on_cpu_data =
arg)
 {
     CPUState *cpu_state =3D cpu;
     hvf_put_registers(cpu_state);
@@ -979,6 +981,7 @@ static int hvf_accel_init(MachineState *ms)
     hvf_state =3D s;
     cpu_interrupt_handler =3D hvf_handle_interrupt;
     memory_listener_register(&hvf_memory_listener, &address_space_memory);
+    cpus_register_accel_interface(&hvf_cpus_interface);
     return 0;
 }
=20
diff --git a/target/i386/whpx-all.c b/target/i386/whpx-all.c
index c78baac6df..daac9858ae 100644
--- a/target/i386/whpx-all.c
+++ b/target/i386/whpx-all.c
@@ -24,6 +24,8 @@
 #include "migration/blocker.h"
 #include "whp-dispatch.h"
=20
+#include "whpx-cpus-interface.h"
+
 #include <WinHvPlatform.h>
 #include <WinHvEmulation.h>
=20
@@ -1575,6 +1577,7 @@ static int whpx_accel_init(MachineState *ms)
     whpx_memory_init();
=20
     cpu_interrupt_handler =3D whpx_handle_interrupt;
+    cpus_register_accel_interface(&whpx_cpus_interface);
=20
     printf("Windows Hypervisor Platform accelerator is operational\n");
     return 0;
diff --git a/target/i386/whpx-cpus-interface.c b/target/i386/whpx-cpus-inte=
rface.c
new file mode 100644
index 0000000000..d2a4ef0699
--- /dev/null
+++ b/target/i386/whpx-cpus-interface.c
@@ -0,0 +1,96 @@
+/*
+ * QEMU Windows Hypervisor Platform accelerator (WHPX)
+ *
+ * Copyright Microsoft Corp. 2017
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or late=
r.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "sysemu/kvm_int.h"
+#include "qemu/main-loop.h"
+#include "sysemu/cpus.h"
+#include "qemu/guest-random.h"
+
+#include "sysemu/whpx.h"
+#include "whpx-cpus-interface.h"
+
+#include <WinHvPlatform.h>
+#include <WinHvEmulation.h>
+
+static void *whpx_cpu_thread_fn(void *arg)
+{
+    CPUState *cpu =3D arg;
+    int r;
+
+    rcu_register_thread();
+
+    qemu_mutex_lock_iothread();
+    qemu_thread_get_self(cpu->thread);
+    cpu->thread_id =3D qemu_get_thread_id();
+    current_cpu =3D cpu;
+
+    r =3D whpx_init_vcpu(cpu);
+    if (r < 0) {
+        fprintf(stderr, "whpx_init_vcpu failed: %s\n", strerror(-r));
+        exit(1);
+    }
+
+    /* signal CPU creation */
+    cpu_thread_signal_created(cpu);
+    qemu_guest_random_seed_thread_part2(cpu->random_seed);
+
+    do {
+        if (cpu_can_run(cpu)) {
+            r =3D whpx_vcpu_exec(cpu);
+            if (r =3D=3D EXCP_DEBUG) {
+                cpu_handle_guest_debug(cpu);
+            }
+        }
+        while (cpu_thread_is_idle(cpu)) {
+            qemu_cond_wait_iothread(cpu->halt_cond);
+        }
+        qemu_wait_io_event_common(cpu);
+    } while (!cpu->unplug || cpu_can_run(cpu));
+
+    whpx_destroy_vcpu(cpu);
+    cpu_thread_signal_destroyed(cpu);
+    qemu_mutex_unlock_iothread();
+    rcu_unregister_thread();
+    return NULL;
+}
+
+static void whpx_start_vcpu_thread(CPUState *cpu)
+{
+    char thread_name[VCPU_THREAD_NAME_SIZE];
+
+    cpu->thread =3D g_malloc0(sizeof(QemuThread));
+    cpu->halt_cond =3D g_malloc0(sizeof(QemuCond));
+    qemu_cond_init(cpu->halt_cond);
+    snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/WHPX",
+             cpu->cpu_index);
+    qemu_thread_create(cpu->thread, thread_name, whpx_cpu_thread_fn,
+                       cpu, QEMU_THREAD_JOINABLE);
+#ifdef _WIN32
+    cpu->hThread =3D qemu_thread_get_handle(cpu->thread);
+#endif
+}
+
+static void whpx_kick_vcpu_thread(CPUState *cpu)
+{
+    if (!qemu_cpu_is_self(cpu)) {
+        whpx_vcpu_kick(cpu);
+    }
+}
+
+CpusAccelInterface whpx_cpus_interface =3D {
+    .create_vcpu_thread =3D whpx_start_vcpu_thread,
+    .kick_vcpu_thread =3D whpx_kick_vcpu_thread,
+
+    .cpu_synchronize_post_reset =3D whpx_cpu_synchronize_post_reset,
+    .cpu_synchronize_post_init =3D whpx_cpu_synchronize_post_init,
+    .cpu_synchronize_state =3D whpx_cpu_synchronize_state,
+    .cpu_synchronize_pre_loadvm =3D whpx_cpu_synchronize_pre_loadvm,
+};
diff --git a/target/i386/whpx-cpus-interface.h b/target/i386/whpx-cpus-inte=
rface.h
new file mode 100644
index 0000000000..084e8b15b8
--- /dev/null
+++ b/target/i386/whpx-cpus-interface.h
@@ -0,0 +1,8 @@
+#ifndef WHPX_CPUS_INTERFACE_H
+#define WHPX_CPUS_INTERFACE_H
+
+#include "sysemu/cpus.h"
+
+extern CpusAccelInterface whpx_cpus_interface;
+
+#endif /* WHPX_CPUS_INTERFACE */
--=20
2.16.4