From nobody Thu May 9 00:42:12 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zohomail.com: domain of lists.xenproject.org designates 192.237.175.120 as permitted sender) client-ip=192.237.175.120; envelope-from=xen-devel-bounces@lists.xenproject.org; helo=lists.xenproject.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zohomail.com: domain of lists.xenproject.org designates 192.237.175.120 as permitted sender) smtp.mailfrom=xen-devel-bounces@lists.xenproject.org ARC-Seal: i=1; a=rsa-sha256; t=1591188788; cv=none; d=zohomail.com; s=zohoarc; b=NjbPvb05or51dMQGltr7h4PdRn/4Y+Cw+0havanjlny/Wn4RDwjugxYi5eKv+NtkC6gr1IIKp2ID+dNkVKm3Nr5WVfzzwe3DBGdR15K8xSVIzUKtBZDzcmT82cyPOPu3+NsX4LLp3tVRi+OQkK9gBo6YfO+5x5PbSQxvScCLq8w= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1591188788; h=Content-Transfer-Encoding:Cc:Date:From:List-Subscribe:List-Post:List-Id:List-Help:List-Unsubscribe:MIME-Version:Message-ID:Sender:Subject:To; bh=TI9bV+LeMVxfdt8lcpN5FNLTi4bVZXWnT3tiy2sCbRI=; b=ZHX50IDoRa6uDqWPTfDz/O/kRqRleyO2CV3VMraXHorLAwN2PMKmZnQxya3wHRKRVLBBHyyCQo4zlfcFUg+ZsUTgq2Hh3DSrYqabc/5N40Z+F2d6ED2bhnJRvAhG4xZ8DpMJMOp0ch4iVPhd3slRlvRQuN5cxSGnf7WhH9Hle30= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=fail; spf=pass (zohomail.com: domain of lists.xenproject.org designates 192.237.175.120 as permitted sender) smtp.mailfrom=xen-devel-bounces@lists.xenproject.org Return-Path: Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) by mx.zohomail.com with SMTPS id 1591188788853809.8667390496596; Wed, 3 Jun 2020 05:53:08 -0700 (PDT) Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1jgSsu-0001R3-Rr; Wed, 03 Jun 2020 12:52:44 +0000 Received: from us1-rack-iad1.inumbo.com ([172.99.69.81]) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1jgSst-0001Qy-8P for xen-devel@lists.xenproject.org; Wed, 03 Jun 2020 12:52:43 +0000 Received: from mail-oi1-x243.google.com (unknown [2607:f8b0:4864:20::243]) by us1-rack-iad1.inumbo.com (Halon) with ESMTPS id 1cd95be2-a599-11ea-8993-bc764e2007e4; Wed, 03 Jun 2020 12:52:42 +0000 (UTC) Received: by mail-oi1-x243.google.com with SMTP id x202so1659999oix.11 for ; Wed, 03 Jun 2020 05:52:42 -0700 (PDT) Received: from localhost (c-71-205-12-124.hsd1.co.comcast.net. [71.205.12.124]) by smtp.gmail.com with ESMTPSA id 89sm471962oth.9.2020.06.03.05.52.39 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 03 Jun 2020 05:52:40 -0700 (PDT) X-Inumbo-ID: 1cd95be2-a599-11ea-8993-bc764e2007e4 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=tklengyel-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:mime-version :content-transfer-encoding; bh=TI9bV+LeMVxfdt8lcpN5FNLTi4bVZXWnT3tiy2sCbRI=; b=E9GQUJjq8yL4WVda/KAG/4p1XS8/EyBKH+3kntmWoYZleJfi/1S/ulX0WPTx/fnjiA OIeCKlHLVMFnZxWwY5pdsHBmCArKjhy2RkuNcCwtcjywWFydtEb7UjQBzGqGY92KDcb/ ITuRQJpS1FYT9TVvLAGlNSadM++If1X1oyqJcauW0VHVaJuEE6QQ2Yerj+S+sjJ/rXP2 OAMKv9OTRMtGqChbOooPwf3+OGDIMbAHFGHsxn/VlT3Ofm8SgFJcyG4jnnQ+cqxKqv2A bfV1qYoNdEVQsg72PNOvJPoNd93issxs52vgIf1beaIXUu18R7RnLyANwWSNPO7DrUJl BMdg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:mime-version :content-transfer-encoding; bh=TI9bV+LeMVxfdt8lcpN5FNLTi4bVZXWnT3tiy2sCbRI=; b=Ew8YSg3dKZJJKUE+6ILmtRQT7LdLdLdzk5zmnLdI1i7GZe7mz/1oVlrTbtPf9IYLP4 xWDV8l1BkNzfY+qbmrcRxA8h0B2Mc7bIqADNWWLiJ5qZCOTT1vXSTTCklucq45vYeTdG X08JmiDDdqCE768zoQf8dItUHsicwCa1zWe8Rs27Oz840Vbe/ivUql2hJ0rUk/WWqFDS 9BEQ9Tanzj3HX457Go+MLHu/ZmvTNcZLJoS4V4s7+bFkxeY9UnxJBY6FfKJTGwRdePiK 0f7BEzVD7zVzeFX5AnUFpniC6dHQYfeDgndiKI31/JOHSnDgSUgsEBgaz3xxGyjEtHaY kZ1w== X-Gm-Message-State: AOAM531wI1p565KR6Zz7ZI/RqC7+SPpATdRbjz+ukuyBW+9rafhL3QRa 7OzDbKG6ld3y8dmAFtqHKOOM+LUduVQ= X-Google-Smtp-Source: ABdhPJx3zN5OBrSLzPMWqgMC1+eM4NK2sJzjVscfJyr+3jaLdle6IneNwN6us47Q/rxyn1/hxoiiSA== X-Received: by 2002:a54:460f:: with SMTP id p15mr5890733oip.84.1591188761256; Wed, 03 Jun 2020 05:52:41 -0700 (PDT) From: Tamas K Lengyel To: xen-devel@lists.xenproject.org Subject: [PATCH v4 for-4.14] x86/monitor: revert default behavior when monitoring register write events Date: Wed, 3 Jun 2020 06:52:37 -0600 Message-Id: <20200603125237.100041-1-tamas@tklengyel.com> X-Mailer: git-send-email 2.25.1 MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-BeenThere: xen-devel@lists.xenproject.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Cc: Petre Pircalabu , Tamas K Lengyel , Julien Grall , Wei Liu , Andrew Cooper , Ian Jackson , George Dunlap , Stefano Stabellini , Jan Beulich , Alexandru Isaila , =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= Errors-To: xen-devel-bounces@lists.xenproject.org Sender: "Xen-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) Content-Type: text/plain; charset="utf-8" For the last couple years we have received numerous reports from users of monitor vm_events of spurious guest crashes when using events. In particula= r, it has observed that the problem occurs when vm_events are being disabled. = The nature of the guest crash varied widely and has only occured occasionally. = This made debugging the issue particularly hard. We had discussions about this i= ssue even here on the xen-devel mailinglist with no luck figuring it out. The bug has now been identified as a race-condition between register event handling and disabling the monitor vm_event interface. The default behavior regarding emulation of register write events is changed so that they get postponed until the corresponding vm_event handler decides whether to allow= such write to take place. Unfortunately this can only be implemented by performi= ng the deny/allow step when the vCPU gets scheduled. Due to that postponed emulation of the event if the user decides to pause t= he VM in the vm_event handler and then disable events, the entire emulation st= ep is skipped the next time the vCPU is resumed. Even if the user doesn't pause during the vm_event handling but exits immediately and disables vm_event, t= he situation becomes racey as disabling vm_event may succeed before the guest's vCPUs get scheduled with the pending emulation task. This has been particul= arly the case with VMS that have several vCPUs as after the VM is unpaused it may actually take a long time before all vCPUs get scheduled. In this patch we are reverting the default behavior to always perform emula= tion of register write events when the event occurs. To postpone them can be tur= ned on as an option. In that case the user of the interface still has to take c= are of only disabling the interface when its safe as it remains buggy. Fixes: 96760e2fba10 ('vm_event: deny register writes if refused by vm_event reply'). Signed-off-by: Tamas K Lengyel Acked-by: Jan Beulich Reviewed-by: Roger Pau Monn=C3=A9 --- xen/arch/x86/hvm/hvm.c | 14 ++++++++------ xen/arch/x86/hvm/monitor.c | 13 ++++++++----- xen/include/asm-x86/domain.h | 1 + xen/include/asm-x86/hvm/monitor.h | 7 +++---- xen/include/asm-x86/monitor.h | 4 ++++ xen/include/public/domctl.h | 6 ++++++ 6 files changed, 30 insertions(+), 15 deletions(-) diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index 74c9f84462..5bb47583b3 100644 --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -3601,13 +3601,15 @@ int hvm_msr_write_intercept(unsigned int msr, uint6= 4_t msr_content, =20 ASSERT(v->arch.vm_event); =20 - /* The actual write will occur in hvm_do_resume() (if permitted). = */ - v->arch.vm_event->write_data.do_write.msr =3D 1; - v->arch.vm_event->write_data.msr =3D msr; - v->arch.vm_event->write_data.value =3D msr_content; + if ( hvm_monitor_msr(msr, msr_content, msr_old_content) ) + { + /* The actual write will occur in hvm_do_resume(), if permitte= d. */ + v->arch.vm_event->write_data.do_write.msr =3D 1; + v->arch.vm_event->write_data.msr =3D msr; + v->arch.vm_event->write_data.value =3D msr_content; =20 - hvm_monitor_msr(msr, msr_content, msr_old_content); - return X86EMUL_OKAY; + return X86EMUL_OKAY; + } } =20 if ( (ret =3D guest_wrmsr(v, msr, msr_content)) !=3D X86EMUL_UNHANDLEA= BLE ) diff --git a/xen/arch/x86/hvm/monitor.c b/xen/arch/x86/hvm/monitor.c index 8aa14137e2..e4a09964a0 100644 --- a/xen/arch/x86/hvm/monitor.c +++ b/xen/arch/x86/hvm/monitor.c @@ -53,11 +53,11 @@ bool hvm_monitor_cr(unsigned int index, unsigned long v= alue, unsigned long old) .u.write_ctrlreg.old_value =3D old }; =20 - if ( monitor_traps(curr, sync, &req) >=3D 0 ) - return 1; + return monitor_traps(curr, sync, &req) >=3D 0 && + curr->domain->arch.monitor.control_register_values; } =20 - return 0; + return false; } =20 bool hvm_monitor_emul_unimplemented(void) @@ -77,7 +77,7 @@ bool hvm_monitor_emul_unimplemented(void) monitor_traps(curr, true, &req) =3D=3D 1; } =20 -void hvm_monitor_msr(unsigned int msr, uint64_t new_value, uint64_t old_va= lue) +bool hvm_monitor_msr(unsigned int msr, uint64_t new_value, uint64_t old_va= lue) { struct vcpu *curr =3D current; =20 @@ -92,8 +92,11 @@ void hvm_monitor_msr(unsigned int msr, uint64_t new_valu= e, uint64_t old_value) .u.mov_to_msr.old_value =3D old_value }; =20 - monitor_traps(curr, 1, &req); + return monitor_traps(curr, 1, &req) >=3D 0 && + curr->domain->arch.monitor.control_register_values; } + + return false; } =20 void hvm_monitor_descriptor_access(uint64_t exit_info, diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h index e8cee3d5e5..6fd94c2e14 100644 --- a/xen/include/asm-x86/domain.h +++ b/xen/include/asm-x86/domain.h @@ -418,6 +418,7 @@ struct arch_domain * This is used to filter out pagefaults. */ unsigned int inguest_pagefault_disabled = : 1; + unsigned int control_register_values = : 1; struct monitor_msr_bitmap *msr_bitmap; uint64_t write_ctrlreg_mask[4]; } monitor; diff --git a/xen/include/asm-x86/hvm/monitor.h b/xen/include/asm-x86/hvm/mo= nitor.h index 66de24cb75..a75cd8545c 100644 --- a/xen/include/asm-x86/hvm/monitor.h +++ b/xen/include/asm-x86/hvm/monitor.h @@ -29,15 +29,14 @@ enum hvm_monitor_debug_type }; =20 /* - * Called for current VCPU on crX/MSR changes by guest. - * The event might not fire if the client has subscribed to it in onchange= only - * mode, hence the bool return type for control register write events. + * Called for current VCPU on crX/MSR changes by guest. Bool return signals + * whether emulation should be postponed. */ bool hvm_monitor_cr(unsigned int index, unsigned long value, unsigned long old); #define hvm_monitor_crX(cr, new, old) \ hvm_monitor_cr(VM_EVENT_X86_##cr, new, old) -void hvm_monitor_msr(unsigned int msr, uint64_t value, uint64_t old_value); +bool hvm_monitor_msr(unsigned int msr, uint64_t value, uint64_t old_value); void hvm_monitor_descriptor_access(uint64_t exit_info, uint64_t vmx_exit_qualification, uint8_t descriptor, bool is_write); diff --git a/xen/include/asm-x86/monitor.h b/xen/include/asm-x86/monitor.h index 4afb0665e8..01c6d63bb9 100644 --- a/xen/include/asm-x86/monitor.h +++ b/xen/include/asm-x86/monitor.h @@ -59,6 +59,10 @@ int arch_monitor_domctl_op(struct domain *d, struct xen_= domctl_monitor_op *mop) domain_unpause(d); break; =20 + case XEN_DOMCTL_MONITOR_OP_CONTROL_REGISTERS: + d->arch.monitor.control_register_values =3D true; + break; + default: rc =3D -EOPNOTSUPP; } diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h index 1ad34c35eb..59bdc28c89 100644 --- a/xen/include/public/domctl.h +++ b/xen/include/public/domctl.h @@ -1025,6 +1025,12 @@ struct xen_domctl_psr_cmt_op { #define XEN_DOMCTL_MONITOR_OP_DISABLE 1 #define XEN_DOMCTL_MONITOR_OP_GET_CAPABILITIES 2 #define XEN_DOMCTL_MONITOR_OP_EMULATE_EACH_REP 3 +/* + * Control register feature can result in guest-crashes when the monitor + * subsystem is being turned off. User has to take special precautions + * to ensure all vCPUs have resumed before it is safe to turn it off. + */ +#define XEN_DOMCTL_MONITOR_OP_CONTROL_REGISTERS 4 =20 #define XEN_DOMCTL_MONITOR_EVENT_WRITE_CTRLREG 0 #define XEN_DOMCTL_MONITOR_EVENT_MOV_TO_MSR 1 --=20 2.25.1