From: Dmytro Maluka
To: Sean Christopherson, Paolo Bonzini, kvm@vger.kernel.org
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86@kernel.org, "H. Peter Anvin", linux-kernel@vger.kernel.org, Eric Auger, Alex Williamson, Rong L Liu, Zhenyu Wang, Tomasz Nowicki, Grzegorz Jaszczyk, Dmitry Torokhov
Subject: [PATCH 1/3] KVM: x86: Move kvm_(un)register_irq_mask_notifier() to generic KVM
Date: Fri, 15 Jul 2022 17:59:26 +0200
Message-Id: <20220715155928.26362-2-dmy@semihalf.com>
In-Reply-To: <20220715155928.26362-1-dmy@semihalf.com>
References: <20220715155928.26362-1-dmy@semihalf.com>

In preparation for postponing the resamplefd event until the interrupt is unmasked, move kvm_(un)register_irq_mask_notifier() from x86 to arch-independent code so that it can be used by irqfd.
Note that calling mask notifiers is still implemented for x86 only, so registering mask notifiers on non-x86 will have no effect.

Link: https://lore.kernel.org/kvm/31420943-8c5f-125c-a5ee-d2fde2700083@semihalf.com/
Signed-off-by: Dmytro Maluka
---
 arch/x86/include/asm/kvm_host.h | 10 ----------
 arch/x86/kvm/irq_comm.c         | 18 ------------------
 include/linux/kvm_host.h        | 10 ++++++++++
 virt/kvm/eventfd.c              | 18 ++++++++++++++++++
 4 files changed, 28 insertions(+), 28 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 9217bd6cf0d1..39a867d68721 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1688,16 +1688,6 @@ int load_pdptrs(struct kvm_vcpu *vcpu, unsigned long cr3);
 int emulator_write_phys(struct kvm_vcpu *vcpu, gpa_t gpa,
 			const void *val, int bytes);
 
-struct kvm_irq_mask_notifier {
-	void (*func)(struct kvm_irq_mask_notifier *kimn, bool masked);
-	int irq;
-	struct hlist_node link;
-};
-
-void kvm_register_irq_mask_notifier(struct kvm *kvm, int irq,
-				    struct kvm_irq_mask_notifier *kimn);
-void kvm_unregister_irq_mask_notifier(struct kvm *kvm, int irq,
-				      struct kvm_irq_mask_notifier *kimn);
 void kvm_fire_mask_notifiers(struct kvm *kvm, unsigned irqchip, unsigned pin,
 			     bool mask);
 
diff --git a/arch/x86/kvm/irq_comm.c b/arch/x86/kvm/irq_comm.c
index 0687162c4f22..43e13892ed34 100644
--- a/arch/x86/kvm/irq_comm.c
+++ b/arch/x86/kvm/irq_comm.c
@@ -234,24 +234,6 @@ void kvm_free_irq_source_id(struct kvm *kvm, int irq_source_id)
 	mutex_unlock(&kvm->irq_lock);
 }
 
-void kvm_register_irq_mask_notifier(struct kvm *kvm, int irq,
-				    struct kvm_irq_mask_notifier *kimn)
-{
-	mutex_lock(&kvm->irq_lock);
-	kimn->irq = irq;
-	hlist_add_head_rcu(&kimn->link, &kvm->arch.mask_notifier_list);
-	mutex_unlock(&kvm->irq_lock);
-}
-
-void kvm_unregister_irq_mask_notifier(struct kvm *kvm, int irq,
-				      struct kvm_irq_mask_notifier *kimn)
-{
-	mutex_lock(&kvm->irq_lock);
-	hlist_del_rcu(&kimn->link);
-	mutex_unlock(&kvm->irq_lock);
-	synchronize_srcu(&kvm->irq_srcu);
-}
-
 void kvm_fire_mask_notifiers(struct kvm *kvm, unsigned irqchip, unsigned pin,
 			     bool mask)
 {
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 90a45ef7203b..9e12ef503157 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1581,6 +1581,12 @@ struct kvm_irq_ack_notifier {
 	void (*irq_acked)(struct kvm_irq_ack_notifier *kian);
 };
 
+struct kvm_irq_mask_notifier {
+	void (*func)(struct kvm_irq_mask_notifier *kimn, bool masked);
+	int irq;
+	struct hlist_node link;
+};
+
 int kvm_irq_map_gsi(struct kvm *kvm,
 		    struct kvm_kernel_irq_routing_entry *entries, int gsi);
 int kvm_irq_map_chip_pin(struct kvm *kvm, unsigned irqchip, unsigned pin);
@@ -1599,6 +1605,10 @@ void kvm_register_irq_ack_notifier(struct kvm *kvm,
 				   struct kvm_irq_ack_notifier *kian);
 void kvm_unregister_irq_ack_notifier(struct kvm *kvm,
 				     struct kvm_irq_ack_notifier *kian);
+void kvm_register_irq_mask_notifier(struct kvm *kvm, int irq,
+				    struct kvm_irq_mask_notifier *kimn);
+void kvm_unregister_irq_mask_notifier(struct kvm *kvm, int irq,
+				      struct kvm_irq_mask_notifier *kimn);
 int kvm_request_irq_source_id(struct kvm *kvm);
 void kvm_free_irq_source_id(struct kvm *kvm, int irq_source_id);
 bool kvm_arch_irqfd_allowed(struct kvm *kvm, struct kvm_irqfd *args);
diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
index 2a3ed401ce46..50ddb1d1a7f0 100644
--- a/virt/kvm/eventfd.c
+++ b/virt/kvm/eventfd.c
@@ -520,6 +520,24 @@ void kvm_unregister_irq_ack_notifier(struct kvm *kvm,
 }
 #endif
 
+void kvm_register_irq_mask_notifier(struct kvm *kvm, int irq,
+				    struct kvm_irq_mask_notifier *kimn)
+{
+	mutex_lock(&kvm->irq_lock);
+	kimn->irq = irq;
+	hlist_add_head_rcu(&kimn->link, &kvm->arch.mask_notifier_list);
+	mutex_unlock(&kvm->irq_lock);
+}
+
+void kvm_unregister_irq_mask_notifier(struct kvm *kvm, int irq,
+				      struct kvm_irq_mask_notifier *kimn)
+{
+	mutex_lock(&kvm->irq_lock);
+	hlist_del_rcu(&kimn->link);
+	mutex_unlock(&kvm->irq_lock);
+	synchronize_srcu(&kvm->irq_srcu);
+}
+
 void kvm_eventfd_init(struct kvm *kvm)
 {
-- 
2.37.0.170.g444d1eabd0-goog

From: Dmytro Maluka
To: Sean Christopherson, Paolo Bonzini, kvm@vger.kernel.org
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86@kernel.org, "H. Peter Anvin", linux-kernel@vger.kernel.org, Eric Auger, Alex Williamson, Rong L Liu, Zhenyu Wang, Tomasz Nowicki, Grzegorz Jaszczyk, Dmitry Torokhov
Subject: [PATCH 2/3] KVM: x86: Add kvm_irq_is_masked()
Date: Fri, 15 Jul 2022 17:59:27 +0200
Message-Id: <20220715155928.26362-3-dmy@semihalf.com>
In-Reply-To: <20220715155928.26362-1-dmy@semihalf.com>
References: <20220715155928.26362-1-dmy@semihalf.com>

In order to postpone the resamplefd event until an interrupt is unmasked, we need not only to track changes of the interrupt mask state (which is made possible by the previous patch "KVM: x86: Move kvm_(un)register_irq_mask_notifier() to generic KVM") but also to know its initial mask state at the time of registering a resamplefd listener. So implement kvm_irq_is_masked() for that. For now it is implemented for x86 only (see below).

The implementation is trickier than I'd like it to be, for 2 reasons:

1. Interrupt (GSI) to irqchip pin mapping is not a 1:1 mapping: an IRQ may map to multiple pins on different irqchips. I guess the only reason for that is to support x86 interrupts 0-15, for which we don't know whether the guest uses the PIC or the IOAPIC. For this reason kvm_set_irq() delivers the interrupt to both, assuming the guest will ignore the unused one. For the same reason, kvm_irq_is_masked() should also take the mask state of both irqchips into account. We consider an interrupt unmasked if and only if it is unmasked in at least one of the PIC or the IOAPIC, assuming that in the unused one all interrupts should be masked.

2. For now ->is_masked() is implemented for x86 only, so we need to handle the case when ->is_masked() is not provided by the irqchip.
In such a case kvm_irq_is_masked() returns failure, and its caller may fall back to the assumption that the interrupt is always unmasked.

Link: https://lore.kernel.org/kvm/31420943-8c5f-125c-a5ee-d2fde2700083@semihalf.com/
Signed-off-by: Dmytro Maluka
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/i8259.c            | 11 +++++++++++
 arch/x86/kvm/ioapic.c           | 11 +++++++++++
 arch/x86/kvm/ioapic.h           |  1 +
 arch/x86/kvm/irq_comm.c         | 16 ++++++++++++++++
 include/linux/kvm_host.h        |  3 +++
 virt/kvm/irqchip.c              | 34 +++++++++++++++++++++++++++++++++
 7 files changed, 77 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 39a867d68721..64618b890700 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1840,6 +1840,7 @@ static inline int __kvm_irq_line_state(unsigned long *irq_state,
 
 int kvm_pic_set_irq(struct kvm_pic *pic, int irq, int irq_source_id, int level);
 void kvm_pic_clear_all(struct kvm_pic *pic, int irq_source_id);
+bool kvm_pic_irq_is_masked(struct kvm_pic *pic, int irq);
 
 void kvm_inject_nmi(struct kvm_vcpu *vcpu);
 
diff --git a/arch/x86/kvm/i8259.c b/arch/x86/kvm/i8259.c
index e1bb6218bb96..2d1ed3bc7cc5 100644
--- a/arch/x86/kvm/i8259.c
+++ b/arch/x86/kvm/i8259.c
@@ -211,6 +211,17 @@ void kvm_pic_clear_all(struct kvm_pic *s, int irq_source_id)
 	pic_unlock(s);
 }
 
+bool kvm_pic_irq_is_masked(struct kvm_pic *s, int irq)
+{
+	bool ret;
+
+	pic_lock(s);
+	ret = !!(s->pics[irq >> 3].imr & (1 << irq));
+	pic_unlock(s);
+
+	return ret;
+}
+
 /*
  * acknowledge interrupt 'irq'
  */
diff --git a/arch/x86/kvm/ioapic.c b/arch/x86/kvm/ioapic.c
index 765943d7cfa5..874f68a65c87 100644
--- a/arch/x86/kvm/ioapic.c
+++ b/arch/x86/kvm/ioapic.c
@@ -478,6 +478,17 @@ void kvm_ioapic_clear_all(struct kvm_ioapic *ioapic, int irq_source_id)
 	spin_unlock(&ioapic->lock);
 }
 
+bool kvm_ioapic_irq_is_masked(struct kvm_ioapic *ioapic, int irq)
+{
+	bool ret;
+
+	spin_lock(&ioapic->lock);
+	ret = !!ioapic->redirtbl[irq].fields.mask;
+	spin_unlock(&ioapic->lock);
+
+	return ret;
+}
+
 static void kvm_ioapic_eoi_inject_work(struct work_struct *work)
 {
 	int i;
diff --git a/arch/x86/kvm/ioapic.h b/arch/x86/kvm/ioapic.h
index 539333ac4b38..fe1f51319992 100644
--- a/arch/x86/kvm/ioapic.h
+++ b/arch/x86/kvm/ioapic.h
@@ -114,6 +114,7 @@ void kvm_ioapic_destroy(struct kvm *kvm);
 int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int irq_source_id,
 		       int level, bool line_status);
 void kvm_ioapic_clear_all(struct kvm_ioapic *ioapic, int irq_source_id);
+bool kvm_ioapic_irq_is_masked(struct kvm_ioapic *ioapic, int irq);
 void kvm_get_ioapic(struct kvm *kvm, struct kvm_ioapic_state *state);
 void kvm_set_ioapic(struct kvm *kvm, struct kvm_ioapic_state *state);
 void kvm_ioapic_scan_entry(struct kvm_vcpu *vcpu,
diff --git a/arch/x86/kvm/irq_comm.c b/arch/x86/kvm/irq_comm.c
index 43e13892ed34..5bff6d6ac54f 100644
--- a/arch/x86/kvm/irq_comm.c
+++ b/arch/x86/kvm/irq_comm.c
@@ -34,6 +34,13 @@ static int kvm_set_pic_irq(struct kvm_kernel_irq_routing_entry *e,
 	return kvm_pic_set_irq(pic, e->irqchip.pin, irq_source_id, level);
 }
 
+static bool kvm_is_masked_pic_irq(struct kvm_kernel_irq_routing_entry *e,
+				  struct kvm *kvm)
+{
+	struct kvm_pic *pic = kvm->arch.vpic;
+	return kvm_pic_irq_is_masked(pic, e->irqchip.pin);
+}
+
 static int kvm_set_ioapic_irq(struct kvm_kernel_irq_routing_entry *e,
 			      struct kvm *kvm, int irq_source_id, int level,
 			      bool line_status)
@@ -43,6 +50,13 @@ static int kvm_set_ioapic_irq(struct kvm_kernel_irq_routing_entry *e,
 				       line_status);
 }
 
+static bool kvm_is_masked_ioapic_irq(struct kvm_kernel_irq_routing_entry *e,
+				     struct kvm *kvm)
+{
+	struct kvm_ioapic *ioapic = kvm->arch.vioapic;
+	return kvm_ioapic_irq_is_masked(ioapic, e->irqchip.pin);
+}
+
 int kvm_irq_delivery_to_apic(struct kvm *kvm, struct kvm_lapic *src,
 		struct kvm_lapic_irq *irq, struct dest_map *dest_map)
 {
@@ -275,11 +289,13 @@ int kvm_set_routing_entry(struct kvm *kvm,
 		if (ue->u.irqchip.pin >= PIC_NUM_PINS / 2)
 			return -EINVAL;
 		e->set = kvm_set_pic_irq;
+		e->is_masked = kvm_is_masked_pic_irq;
 		break;
 	case KVM_IRQCHIP_IOAPIC:
 		if (ue->u.irqchip.pin >= KVM_IOAPIC_NUM_PINS)
 			return -EINVAL;
 		e->set = kvm_set_ioapic_irq;
+		e->is_masked = kvm_is_masked_ioapic_irq;
 		break;
 	default:
 		return -EINVAL;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 9e12ef503157..e8bfb3b0d4d1 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -625,6 +625,8 @@ struct kvm_kernel_irq_routing_entry {
 	int (*set)(struct kvm_kernel_irq_routing_entry *e,
 		   struct kvm *kvm, int irq_source_id, int level,
 		   bool line_status);
+	bool (*is_masked)(struct kvm_kernel_irq_routing_entry *e,
+			  struct kvm *kvm);
 	union {
 		struct {
 			unsigned irqchip;
@@ -1598,6 +1600,7 @@ int kvm_set_msi(struct kvm_kernel_irq_routing_entry *irq_entry, struct kvm *kvm,
 int kvm_arch_set_irq_inatomic(struct kvm_kernel_irq_routing_entry *e,
 			      struct kvm *kvm, int irq_source_id, int level,
 			      bool line_status);
+int kvm_irq_is_masked(struct kvm *kvm, int irq, bool *masked);
 bool kvm_irq_has_notifier(struct kvm *kvm, unsigned irqchip, unsigned pin);
 void kvm_notify_acked_gsi(struct kvm *kvm, int gsi);
 void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, unsigned pin);
diff --git a/virt/kvm/irqchip.c b/virt/kvm/irqchip.c
index 58e4f88b2b9f..9252ebedba55 100644
--- a/virt/kvm/irqchip.c
+++ b/virt/kvm/irqchip.c
@@ -97,6 +97,40 @@ int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int level,
 	return ret;
 }
 
+/*
+ * Return value:
+ *  = 0   Interrupt mask state successfully written to `masked`
+ *  < 0   Failed to read interrupt mask state
+ */
+int kvm_irq_is_masked(struct kvm *kvm, int irq, bool *masked)
+{
+	struct kvm_kernel_irq_routing_entry irq_set[KVM_NR_IRQCHIPS];
+	int ret = -1, i, idx;
+
+	/* Not possible to detect if the guest uses the PIC or the
+	 * IOAPIC. So assume the interrupt to be unmasked iff it is
+	 * unmasked in at least one of both.
+	 */
+	idx = srcu_read_lock(&kvm->irq_srcu);
+	i = kvm_irq_map_gsi(kvm, irq_set, irq);
+	srcu_read_unlock(&kvm->irq_srcu, idx);
+
+	while (i--) {
+		if (!irq_set[i].is_masked)
+			continue;
+
+		if (!irq_set[i].is_masked(&irq_set[i], kvm)) {
+			*masked = false;
+			return 0;
+		}
+		ret = 0;
+	}
+	if (!ret)
+		*masked = true;
+
+	return ret;
+}
+
 static void free_irq_routing_table(struct kvm_irq_routing_table *rt)
 {
 	int i;
-- 
2.37.0.170.g444d1eabd0-goog

From: Dmytro Maluka
To: Sean Christopherson, Paolo Bonzini, kvm@vger.kernel.org
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86@kernel.org, "H. Peter Anvin", linux-kernel@vger.kernel.org, Eric Auger, Alex Williamson, Rong L Liu, Zhenyu Wang, Tomasz Nowicki, Grzegorz Jaszczyk, Dmitry Torokhov
Subject: [PATCH 3/3] KVM: irqfd: Postpone resamplefd notify for oneshot interrupts
Date: Fri, 15 Jul 2022 17:59:28 +0200
Message-Id: <20220715155928.26362-4-dmy@semihalf.com>
In-Reply-To: <20220715155928.26362-1-dmy@semihalf.com>
References: <20220715155928.26362-1-dmy@semihalf.com>

The existing KVM mechanism for forwarding level-triggered interrupts using a resample eventfd doesn't work quite correctly for interrupts that are handled in a Linux guest as oneshot interrupts (IRQF_ONESHOT). Such an interrupt is acked to the device in its threaded irq handler, i.e. later than it is acked to the interrupt controller (EOI at the end of hardirq), not earlier. Linux keeps such an interrupt masked until its threaded handler finishes, to prevent the EOI from re-asserting an unacknowledged interrupt.

However, with KVM + vfio (or whatever is listening on the resamplefd) we don't check that the interrupt is still masked in the guest at the moment of EOI. The resamplefd is notified regardless, so vfio prematurely unmasks the host physical IRQ, and a new (unwanted) physical interrupt is generated in the host and queued for injection to the guest.

The fact that the virtual IRQ is still masked doesn't prevent this new physical IRQ from being propagated to the guest, because:

1. It is not guaranteed that the vIRQ will remain masked by the time when vfio signals the trigger eventfd.

2. KVM marks this IRQ as pending (e.g. by setting its bit in the virtual IRR register of the IOAPIC on x86), so after the vIRQ is unmasked, this new pending interrupt is injected by KVM to the guest anyway.
At least 2 user-visible issues caused by those extra erroneous pending interrupts for oneshot irqs in the guest have been observed:

1. System suspend aborted due to a pending wakeup interrupt from ChromeOS EC (drivers/platform/chrome/cros_ec.c).

2. Annoying "invalid report id data" errors from the ELAN0000 touchpad (drivers/input/mouse/elan_i2c_core.c), flooding the guest dmesg every time the touchpad is touched.

This patch fixes the issue on x86 by checking whether the interrupt is unmasked when we receive the irq ack (EOI) and, if it is masked, postponing the resamplefd notify until the guest unmasks it.

Important notes:

1. It doesn't fix the issue for other archs yet, due to some missing KVM functionality needed by this patch:
   - calling mask notifiers is implemented for x86 only
   - irqchip ->is_masked() is implemented for x86 only

2. It introduces additional spinlock locking in the resample notify path, since we are no longer just traversing an RCU list of irqfds but also updating the resampler state. Hopefully this locking won't noticeably slow down anything for anyone.

Regarding #2, there may be an alternative solution worth considering: extend the KVM irqfd (userspace) API to send mask and unmask notifications directly to vfio/whatever, in addition to resample notifications, to let vfio check the irq state on its own. There is already locking on the vfio side (see e.g. vfio_platform_unmask()), so this way we would avoid introducing any additional locking. Such mask/unmask notifications could also be useful for other cases.
Link: https://lore.kernel.org/kvm/31420943-8c5f-125c-a5ee-d2fde2700083@semihalf.com/
Suggested-by: Sean Christopherson
Signed-off-by: Dmytro Maluka
---
 include/linux/kvm_irqfd.h | 14 ++++++++++++
 virt/kvm/eventfd.c        | 45 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 59 insertions(+)

diff --git a/include/linux/kvm_irqfd.h b/include/linux/kvm_irqfd.h
index dac047abdba7..01754a1abb9e 100644
--- a/include/linux/kvm_irqfd.h
+++ b/include/linux/kvm_irqfd.h
@@ -19,6 +19,16 @@
  * resamplefd. All resamplers on the same gsi are de-asserted
  * together, so we don't need to track the state of each individual
  * user. We can also therefore share the same irq source ID.
+ *
+ * A special case is when the interrupt is still masked at the moment
+ * an irq ack is received. That likely means that the interrupt has
+ * been acknowledged to the interrupt controller but not acknowledged
+ * to the device yet, e.g. it might be a Linux guest's threaded
+ * oneshot interrupt (IRQF_ONESHOT). In this case notifying through
+ * resamplefd is postponed until the guest unmasks the interrupt,
+ * which is detected through the irq mask notifier. This prevents
+ * erroneous extra interrupts caused by premature re-assert of an
+ * unacknowledged interrupt by the resamplefd listener.
  */
 struct kvm_kernel_irqfd_resampler {
 	struct kvm *kvm;
@@ -28,6 +38,10 @@ struct kvm_kernel_irqfd_resampler {
 	 */
 	struct list_head list;
 	struct kvm_irq_ack_notifier notifier;
+	struct kvm_irq_mask_notifier mask_notifier;
+	bool masked;
+	bool pending;
+	spinlock_t lock;
 	/*
 	 * Entry in list of kvm->irqfd.resampler_list. Use for sharing
 	 * resamplers among irqfds on the same gsi.
diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
index 50ddb1d1a7f0..9ff47ac33790 100644
--- a/virt/kvm/eventfd.c
+++ b/virt/kvm/eventfd.c
@@ -75,6 +75,44 @@ irqfd_resampler_ack(struct kvm_irq_ack_notifier *kian)
 	kvm_set_irq(kvm, KVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID,
 		    resampler->notifier.gsi, 0, false);
 
+	spin_lock(&resampler->lock);
+	if (resampler->masked) {
+		resampler->pending = true;
+		spin_unlock(&resampler->lock);
+		return;
+	}
+	spin_unlock(&resampler->lock);
+
+	idx = srcu_read_lock(&kvm->irq_srcu);
+
+	list_for_each_entry_srcu(irqfd, &resampler->list, resampler_link,
+				 srcu_read_lock_held(&kvm->irq_srcu))
+		eventfd_signal(irqfd->resamplefd, 1);
+
+	srcu_read_unlock(&kvm->irq_srcu, idx);
+}
+
+static void
+irqfd_resampler_mask(struct kvm_irq_mask_notifier *kimn, bool masked)
+{
+	struct kvm_kernel_irqfd_resampler *resampler;
+	struct kvm *kvm;
+	struct kvm_kernel_irqfd *irqfd;
+	int idx;
+
+	resampler = container_of(kimn,
+			struct kvm_kernel_irqfd_resampler, mask_notifier);
+	kvm = resampler->kvm;
+
+	spin_lock(&resampler->lock);
+	resampler->masked = masked;
+	if (masked || !resampler->pending) {
+		spin_unlock(&resampler->lock);
+		return;
+	}
+	resampler->pending = false;
+	spin_unlock(&resampler->lock);
+	idx = srcu_read_lock(&kvm->irq_srcu);
 
 	list_for_each_entry_srcu(irqfd, &resampler->list, resampler_link,
@@ -98,6 +136,8 @@ irqfd_resampler_shutdown(struct kvm_kernel_irqfd *irqfd)
 	if (list_empty(&resampler->list)) {
 		list_del(&resampler->link);
 		kvm_unregister_irq_ack_notifier(kvm, &resampler->notifier);
+		kvm_unregister_irq_mask_notifier(kvm, resampler->mask_notifier.irq,
+						 &resampler->mask_notifier);
 		kvm_set_irq(kvm, KVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID,
 			    resampler->notifier.gsi, 0, false);
 		kfree(resampler);
@@ -367,11 +407,16 @@ kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *args)
 		INIT_LIST_HEAD(&resampler->list);
 		resampler->notifier.gsi = irqfd->gsi;
 		resampler->notifier.irq_acked = irqfd_resampler_ack;
+		resampler->mask_notifier.func = irqfd_resampler_mask;
+		kvm_irq_is_masked(kvm, irqfd->gsi, &resampler->masked);
+		spin_lock_init(&resampler->lock);
 		INIT_LIST_HEAD(&resampler->link);
 
 		list_add(&resampler->link, &kvm->irqfds.resampler_list);
 		kvm_register_irq_ack_notifier(kvm, &resampler->notifier);
+		kvm_register_irq_mask_notifier(kvm, irqfd->gsi,
+					       &resampler->mask_notifier);
 		irqfd->resampler = resampler;
 	}
 
-- 
2.37.0.170.g444d1eabd0-goog