From: Heyi Guo
Date: Tue, 12 Mar 2019 14:09:20 +0800
Message-ID: <1552370960-2061-1-git-send-email-guoheyi@huawei.com>
Subject: [Qemu-devel] [RFC] arm/cpu: fix soft lockup panic after resuming from stop
Cc: Heyi Guo, wanghaibin.wang@huawei.com, Peter Maydell

When a VM is stopped for more than 30 seconds and then resumed with the
QEMU monitor commands "stop" and "cont", the guest Linux kernel complains
of "soft lockup - CPU#x stuck for xxs!" as below:

[ 2783.809517] watchdog: BUG: soft lockup - CPU#3 stuck for 2395s!
[ 2783.809559] watchdog: BUG: soft lockup - CPU#2 stuck for 2395s!
[ 2783.809561] watchdog: BUG: soft lockup - CPU#1 stuck for 2395s!
[ 2783.809563] Modules linked in...

This happens because the guest Linux kernel uses the generic timer virtual
counter for its software watchdog, and CNTVCT_EL0 keeps counting while the
VM is stopped by QEMU.

Fix this by saving the value of CNTVCT_EL0 when the VM is stopped and
restoring it when the VM is resumed.
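
For illustration only, the save-and-restore idea described above can be
sketched as a small toy model. This is not QEMU or KVM code: ToyTimer and
the toy_* helpers are hypothetical names, and the model simply assumes the
guest-visible counter is a free-running host counter minus an offset that
can be rewritten on resume.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model, not QEMU/KVM code: every name here is hypothetical. */
typedef struct {
    uint64_t host_ticks;   /* free-running host counter, never stops */
    uint64_t offset;       /* subtracted from host_ticks for the guest */
    uint64_t saved_ticks;  /* guest counter value captured at pause */
    bool paused;
} ToyTimer;

/* Guest-visible counter: host counter minus the per-VM offset. */
static uint64_t toy_guest_ticks(const ToyTimer *t)
{
    return t->host_ticks - t->offset;
}

/* Host time always advances, whether or not the guest is running. */
static void toy_host_advance(ToyTimer *t, uint64_t ticks)
{
    t->host_ticks += ticks;
}

/* On pause: remember what the guest counter read at that moment. */
static void toy_pause(ToyTimer *t)
{
    assert(!t->paused);
    t->saved_ticks = toy_guest_ticks(t);
    t->paused = true;
}

/* On resume: pick the offset that makes the guest counter read the
 * saved value again, hiding the time spent paused from the guest. */
static void toy_resume(ToyTimer *t)
{
    assert(t->paused);
    t->offset = t->host_ticks - t->saved_ticks;
    t->paused = false;
}
```

In this model, a guest counter that read 1500 at pause reads 1500 again
immediately after resume, however long the host counter kept running in
between, so a watchdog based on that counter sees no jump.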

Cc: Peter Maydell
Signed-off-by: Heyi Guo
---
 target/arm/cpu.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 65 insertions(+)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 96f0ff0..7bbba3d 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -896,6 +896,60 @@ static void arm_cpu_finalizefn(Object *obj)
 #endif
 }
 
+static int get_vcpu_timer_tick(CPUState *cs, uint64_t *tick_at_pause)
+{
+    int err;
+    struct kvm_one_reg reg;
+
+    reg.id = KVM_REG_ARM_TIMER_CNT;
+    reg.addr = (uintptr_t) tick_at_pause;
+
+    err = kvm_vcpu_ioctl(cs, KVM_GET_ONE_REG, &reg);
+    return err;
+}
+
+static int set_vcpu_timer_tick(CPUState *cs, uint64_t tick_at_pause)
+{
+    int err;
+    struct kvm_one_reg reg;
+
+    reg.id = KVM_REG_ARM_TIMER_CNT;
+    reg.addr = (uintptr_t) &tick_at_pause;
+
+    err = kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, &reg);
+    return err;
+}
+
+static void arch_timer_change_state_handler(void *opaque, int running,
+                                            RunState state)
+{
+    static uint64_t hw_ticks_at_paused;
+    static RunState pre_state = RUN_STATE__MAX;
+    int err;
+    CPUState *cs = (CPUState *)opaque;
+
+    switch (state) {
+    case RUN_STATE_PAUSED:
+        err = get_vcpu_timer_tick(cs, &hw_ticks_at_paused);
+        if (err) {
+            error_report("Get vcpu timer tick failed: %d", err);
+        }
+        break;
+    case RUN_STATE_RUNNING:
+        if (pre_state == RUN_STATE_PAUSED) {
+            err = set_vcpu_timer_tick(cs, hw_ticks_at_paused);
+            if (err) {
+                error_report("Resume vcpu timer tick failed: %d", err);
+            }
+        }
+        break;
+    default:
+        break;
+    }
+
+    pre_state = state;
+}
+
 static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
 {
     CPUState *cs = CPU(dev);
@@ -906,6 +960,12 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
     Error *local_err = NULL;
     bool no_aa32 = false;
 
+    /*
+     * Only add change state handler for arch timer once, for KVM will help to
+     * synchronize virtual timer of all VCPUs.
+     */
+    static bool arch_timer_change_state_handler_added;
+
     /* If we needed to query the host kernel for the CPU features
      * then it's possible that might have failed in the initfn, but
      * this is the first point where we can report it.
@@ -1181,6 +1241,11 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
 
     init_cpreg_list(cpu);
 
+    if (!arch_timer_change_state_handler_added && kvm_enabled()) {
+        qemu_add_vm_change_state_handler(arch_timer_change_state_handler, cs);
+        arch_timer_change_state_handler_added = true;
+    }
+
 #ifndef CONFIG_USER_ONLY
     if (cpu->has_el3 || arm_feature(env, ARM_FEATURE_M_SECURITY)) {
         cs->num_ases = 2;
-- 
1.8.3.1