From nobody Thu Jan 1 07:19:56 2026
From: Xiaoyao Li
To: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen
Cc: Jonathan Corbet, Wanpeng Li, Vitaly Kuznetsov, x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Xiaoyao Li
Subject: [PATCH v2 1/2] x86/kvm/async_pf: Use separate percpu variable to track the enabling of asyncpf
Date: Wed, 25 Oct 2023 01:59:13 -0400
Message-Id: <20231025055914.1201792-2-xiaoyao.li@intel.com>
In-Reply-To: <20231025055914.1201792-1-xiaoyao.li@intel.com>
References: <20231025055914.1201792-1-xiaoyao.li@intel.com>

Refer to commit fd10cde9294f ("KVM paravirt: Add async PF initialization
to PV guest") and commit 344d9588a9df ("KVM: Add PV MSR to enable
asynchronous page faults delivery"). When asyncpf was introduced, the
intent was for the shared PV data 'struct kvm_vcpu_pv_apf_data' to be
64 bytes in size. However, the structure was mistakenly defined with a
size of 68 bytes, so it does not fit in a cache line and the code is
inconsistent with the documentation.

The justification below is quoted from Sean[*]:

  KVM (the host side) has *never* read kvm_vcpu_pv_apf_data.enabled, and
  the documentation clearly states that enabling is based solely on the
  bit in the synthetic MSR.

  So rather than update the documentation, fix the goof by removing the
  enabled field and using a separate percpu variable instead.
  KVM-as-a-host obviously doesn't enforce anything or consume the size,
  and changing the header will only affect guests that are rebuilt against
  the new header, so there's no chance of ABI breakage between KVM and its
  guests.
  The only possible breakage is if some other hypervisor is emulating
  KVM's async #PF (LOL) and relies on the guest to set
  kvm_vcpu_pv_apf_data.enabled. But (a) I highly doubt such a hypervisor
  exists, (b) that would arguably be a violation of KVM's "spec", and
  (c) the worst case scenario is that the guest would simply lose async
  #PF functionality.

[*] https://lore.kernel.org/all/ZS7ERnnRqs8Fl0ZF@google.com/T/#u

Suggested-by: Sean Christopherson
Signed-off-by: Xiaoyao Li
---
 Documentation/virt/kvm/x86/msr.rst   |  1 -
 arch/x86/include/uapi/asm/kvm_para.h |  1 -
 arch/x86/kernel/kvm.c                | 11 ++++++-----
 3 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/Documentation/virt/kvm/x86/msr.rst b/Documentation/virt/kvm/x86/msr.rst
index 9315fc385fb0..f6d70f99a1a7 100644
--- a/Documentation/virt/kvm/x86/msr.rst
+++ b/Documentation/virt/kvm/x86/msr.rst
@@ -204,7 +204,6 @@ data:
 		__u32 token;
 
 		__u8 pad[56];
-		__u32 enabled;
 	  };
 
 	Bits 5-4 of the MSR are reserved and should be zero. Bit 0 is set to 1
diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index 6e64b27b2c1e..605899594ebb 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -142,7 +142,6 @@ struct kvm_vcpu_pv_apf_data {
 	__u32 token;
 
 	__u8 pad[56];
-	__u32 enabled;
 };
 
 #define KVM_PV_EOI_BIT 0
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index b8ab9ee5896c..388a3fdd3cad 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -65,6 +65,7 @@ static int __init parse_no_stealacc(char *arg)
 
 early_param("no-steal-acc", parse_no_stealacc);
 
+static DEFINE_PER_CPU_READ_MOSTLY(bool, async_pf_enabled);
 static DEFINE_PER_CPU_DECRYPTED(struct kvm_vcpu_pv_apf_data, apf_reason) __aligned(64);
 DEFINE_PER_CPU_DECRYPTED(struct kvm_steal_time, steal_time) __aligned(64) __visible;
 static int has_steal_clock = 0;
@@ -244,7 +245,7 @@ noinstr u32 kvm_read_and_reset_apf_flags(void)
 {
 	u32 flags = 0;
 
-	if (__this_cpu_read(apf_reason.enabled)) {
+	if (__this_cpu_read(async_pf_enabled)) {
 		flags = __this_cpu_read(apf_reason.flags);
 		__this_cpu_write(apf_reason.flags, 0);
 	}
@@ -295,7 +296,7 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_kvm_asyncpf_interrupt)
 
 	inc_irq_stat(irq_hv_callback_count);
 
-	if (__this_cpu_read(apf_reason.enabled)) {
+	if (__this_cpu_read(async_pf_enabled)) {
 		token = __this_cpu_read(apf_reason.token);
 		kvm_async_pf_task_wake(token);
 		__this_cpu_write(apf_reason.token, 0);
@@ -362,7 +363,7 @@ static void kvm_guest_cpu_init(void)
 		wrmsrl(MSR_KVM_ASYNC_PF_INT, HYPERVISOR_CALLBACK_VECTOR);
 
 		wrmsrl(MSR_KVM_ASYNC_PF_EN, pa);
-		__this_cpu_write(apf_reason.enabled, 1);
+		__this_cpu_write(async_pf_enabled, 1);
 		pr_debug("setup async PF for cpu %d\n", smp_processor_id());
 	}
 
@@ -383,11 +384,11 @@ static void kvm_guest_cpu_init(void)
 
 static void kvm_pv_disable_apf(void)
 {
-	if (!__this_cpu_read(apf_reason.enabled))
+	if (!__this_cpu_read(async_pf_enabled))
 		return;
 
 	wrmsrl(MSR_KVM_ASYNC_PF_EN, 0);
-	__this_cpu_write(apf_reason.enabled, 0);
+	__this_cpu_write(async_pf_enabled, 0);
 
 	pr_debug("disable async PF for cpu %d\n", smp_processor_id());
 }
-- 
2.34.1

From nobody Thu Jan 1 07:19:56 2026
From: Xiaoyao Li
Subject: [PATCH v2 2/2] KVM: x86: Improve documentation of MSR_KVM_ASYNC_PF_EN
Date: Wed, 25 Oct 2023 01:59:14 -0400
Message-Id: <20231025055914.1201792-3-xiaoyao.li@intel.com>
In-Reply-To: <20231025055914.1201792-1-xiaoyao.li@intel.com>
References: <20231025055914.1201792-1-xiaoyao.li@intel.com>

Fix some incorrect statements in the MSR_KVM_ASYNC_PF_EN documentation
and state clearly that the token in 'struct kvm_vcpu_pv_apf_data' for a
'page ready' event matches the token delivered in CR2 for the
corresponding 'page not present' event.

Signed-off-by: Xiaoyao Li
---
 Documentation/virt/kvm/x86/msr.rst | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/Documentation/virt/kvm/x86/msr.rst b/Documentation/virt/kvm/x86/msr.rst
index f6d70f99a1a7..3aecf2a70e7b 100644
--- a/Documentation/virt/kvm/x86/msr.rst
+++ b/Documentation/virt/kvm/x86/msr.rst
@@ -193,8 +193,8 @@ data:
 	Asynchronous page fault (APF) control MSR.
 
 	Bits 63-6 hold 64-byte aligned physical address of a 64 byte memory area
-	which must be in guest RAM and must be zeroed. This memory is expected
-	to hold a copy of the following structure::
+	which must be in guest RAM. This memory is expected to hold the
+	following structure::
 
 	  struct kvm_vcpu_pv_apf_data {
 		/* Used for 'page not present' events delivered via #PF */
@@ -231,14 +231,14 @@ data:
 	as regular page fault, guest must reset 'flags' to '0' before it does
 	something that can generate normal page fault.
 
-	Bytes 5-7 of 64 byte memory location ('token') will be written to by the
+	Bytes 4-7 of 64 byte memory location ('token') will be written to by the
 	hypervisor at the time of APF 'page ready' event injection. The content
-	of these bytes is a token which was previously delivered as 'page not
-	present' event. The event indicates the page in now available. Guest is
-	supposed to write '0' to 'token' when it is done handling 'page ready'
-	event and to write 1' to MSR_KVM_ASYNC_PF_ACK after clearing the location;
-	writing to the MSR forces KVM to re-scan its queue and deliver the next
-	pending notification.
+	of these bytes is a token which was previously delivered in CR2 as
+	'page not present' event. The event indicates the page is now available.
+	Guest is supposed to write '0' to 'token' when it is done handling
+	'page ready' event and to write '1' to MSR_KVM_ASYNC_PF_ACK after
+	clearing the location; writing to the MSR forces KVM to re-scan its
+	queue and deliver the next pending notification.
 
 	Note, MSR_KVM_ASYNC_PF_INT MSR specifying the interrupt vector for 'page
 	ready' APF delivery needs to be written to before enabling APF mechanism
-- 
2.34.1
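
For reference, the size and offset claims in this series (68 bytes with
'enabled', 64 bytes without it, 'token' at bytes 4-7) can be checked with
a small userspace sketch. It is not part of either patch. The struct
names apf_data_old and apf_data_new and the helper fits_in_apf_area() are
hypothetical, and uint32_t/uint8_t stand in for the kernel's __u32/__u8
on the assumption that they have the same size and alignment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Size of the shared memory area described for MSR_KVM_ASYNC_PF_EN. */
#define APF_AREA_SIZE 64

/* Hypothetical userspace copy of the layout before patch 1/2. */
struct apf_data_old {
	uint32_t flags;
	uint32_t token;
	uint8_t pad[56];
	uint32_t enabled;
};

/* Hypothetical userspace copy of the layout after patch 1/2. */
struct apf_data_new {
	uint32_t flags;
	uint32_t token;
	uint8_t pad[56];
};

/* Hypothetical helper: does a structure of this size fit the area? */
static inline bool fits_in_apf_area(size_t size)
{
	return size <= APF_AREA_SIZE;
}

/* 4 + 4 + 56 + 4 bytes: four bytes larger than the documented area. */
static_assert(sizeof(struct apf_data_old) == 68, "old layout is 68 bytes");

/* 4 + 4 + 56 bytes: matches the documented 64 byte area. */
static_assert(sizeof(struct apf_data_new) == 64, "new layout is 64 bytes");

/* 'token' follows the 4-byte 'flags', so it occupies bytes 4-7. */
static_assert(offsetof(struct apf_data_new, token) == 4, "token at byte 4");
static_assert(sizeof(((struct apf_data_new *)0)->token) == 4, "token is 4 bytes");
```

Compiling this as C11 (for example with "cc -std=c11 -c") is enough to
run the checks, since they are all evaluated at build time.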