[PATCH v2 0/8] irqbypass: Cleanups and a perf improvement

Sean Christopherson posted 8 patches 7 months ago
arch/x86/kvm/x86.c                |   4 +-
drivers/vfio/pci/vfio_pci_intrs.c |  10 +-
drivers/vhost/vdpa.c              |  10 +-
include/linux/irqbypass.h         |  46 ++++----
virt/kvm/eventfd.c                |   7 +-
virt/lib/irqbypass.c              | 190 +++++++++++-------------------
6 files changed, 107 insertions(+), 160 deletions(-)
[PATCH v2 0/8] irqbypass: Cleanups and a perf improvement
Posted by Sean Christopherson 7 months ago
The two primary goals of this series are to make the irqbypass concept
easier to understand, and to address the terrible performance that can
result from using a list to track connections.

For the first goal, track the producer/consumer "tokens" as eventfd context
pointers instead of opaque "void *".  Supporting arbitrary token types was
dead infrastructure when it was added 10 years ago, and nothing has changed
since.  Taking an opaque token makes a very simple concept (device signals
eventfd; KVM listens to eventfd) unnecessarily difficult to understand.

Burying that simple behind a layer of obfuscation also makes the overall
code more brittle, as callers can pass in literally anything. I.e. passing
in a token that will never be paired would go unnoticed.

For the performance issue, use an xarray.  I'm definitely not wedded to an
xarray, but IMO it doesn't add meaningful complexity (even requires less
code), and pretty much Just Works.  Like tried this a while back[1], but
the implementation had undesirable behavior changes and stalled out.

Note, I want to do more aggressive cleanups of irqbypass at some point,
e.g. not reporting an error to userspace if connect() fails is awful
behavior for environments that want/need irqbypass to always work.  And
KVM shold probably have a KVM_IRQFD_FLAG_NO_IRQBYPASS if a VM is never going
to use device posted interrupts.  But those are future problems.

v2:
 - Collect reviews. [Kevin, Michael]
 - Track the pointer as "struct eventfd_ctx *eventfd" instead of "void *token".
   [Alex]
 - Fix typos and stale comments. [Kevin, Binbin]
 - Use "trigger" instead of the null token/eventfd pointer on failure in
   vfio_msi_set_vector_signal(). [Kevin]
 - Drop a redundant "tmp == consumer" check from patch 3. [Kevin]
 - Require producers to pass in the line IRQ number.

v1: https://lore.kernel.org/all/20250404211449.1443336-1-seanjc@google.com

[1] https://lore.kernel.org/all/20230801115646.33990-1-likexu@tencent.com
[2] https://lore.kernel.org/all/20250401161804.842968-1-seanjc@google.com

Sean Christopherson (8):
  irqbypass: Drop pointless and misleading THIS_MODULE get/put
  irqbypass: Drop superfluous might_sleep() annotations
  irqbypass: Take ownership of producer/consumer token tracking
  irqbypass: Explicitly track producer and consumer bindings
  irqbypass: Use paired consumer/producer to disconnect during
    unregister
  irqbypass: Use guard(mutex) in lieu of manual lock+unlock
  irqbypass: Use xarray to track producers and consumers
  irqbypass: Require producers to pass in Linux IRQ number during
    registration

 arch/x86/kvm/x86.c                |   4 +-
 drivers/vfio/pci/vfio_pci_intrs.c |  10 +-
 drivers/vhost/vdpa.c              |  10 +-
 include/linux/irqbypass.h         |  46 ++++----
 virt/kvm/eventfd.c                |   7 +-
 virt/lib/irqbypass.c              | 190 +++++++++++-------------------
 6 files changed, 107 insertions(+), 160 deletions(-)


base-commit: 7ef51a41466bc846ad794d505e2e34ff97157f7f
-- 
2.49.0.1112.g889b7c5bd8-goog
Re: [PATCH v2 0/8] irqbypass: Cleanups and a perf improvement
Posted by Sean Christopherson 5 months, 3 weeks ago
On Fri, 16 May 2025 16:07:26 -0700, Sean Christopherson wrote:
> The two primary goals of this series are to make the irqbypass concept
> easier to understand, and to address the terrible performance that can
> result from using a list to track connections.
> 
> For the first goal, track the producer/consumer "tokens" as eventfd context
> pointers instead of opaque "void *".  Supporting arbitrary token types was
> dead infrastructure when it was added 10 years ago, and nothing has changed
> since.  Taking an opaque token makes a very simple concept (device signals
> eventfd; KVM listens to eventfd) unnecessarily difficult to understand.
> 
> [...]

Applied to kvm-x86 irqs, thanks!

[1/8] irqbypass: Drop pointless and misleading THIS_MODULE get/put
      https://github.com/kvm-x86/linux/commit/fa079a0616ed
[2/8] irqbypass: Drop superfluous might_sleep() annotations
      https://github.com/kvm-x86/linux/commit/07fbc83c0152
[3/8] irqbypass: Take ownership of producer/consumer token tracking
      https://github.com/kvm-x86/linux/commit/2b521d86ee80
[4/8] irqbypass: Explicitly track producer and consumer bindings
      https://github.com/kvm-x86/linux/commit/add57f493e08
[5/8] irqbypass: Use paired consumer/producer to disconnect during unregister
      https://github.com/kvm-x86/linux/commit/5d7dbdce388b
[6/8] irqbypass: Use guard(mutex) in lieu of manual lock+unlock
      https://github.com/kvm-x86/linux/commit/46a4bfd0ae48
[7/8] irqbypass: Use xarray to track producers and consumers
      https://github.com/kvm-x86/linux/commit/8394b32faecd
[8/8] irqbypass: Require producers to pass in Linux IRQ number during registration
      https://github.com/kvm-x86/linux/commit/23b54381cee2

--
https://github.com/kvm-x86/kvm-unit-tests/tree/next
Re: [PATCH v2 0/8] irqbypass: Cleanups and a perf improvement
Posted by Alex Williamson 6 months, 2 weeks ago
On Fri, 16 May 2025 16:07:26 -0700
Sean Christopherson <seanjc@google.com> wrote:

> The two primary goals of this series are to make the irqbypass concept
> easier to understand, and to address the terrible performance that can
> result from using a list to track connections.
> 
> For the first goal, track the producer/consumer "tokens" as eventfd context
> pointers instead of opaque "void *".  Supporting arbitrary token types was
> dead infrastructure when it was added 10 years ago, and nothing has changed
> since.  Taking an opaque token makes a very simple concept (device signals
> eventfd; KVM listens to eventfd) unnecessarily difficult to understand.
> 
> Burying that simple behind a layer of obfuscation also makes the overall
> code more brittle, as callers can pass in literally anything. I.e. passing
> in a token that will never be paired would go unnoticed.
> 
> For the performance issue, use an xarray.  I'm definitely not wedded to an
> xarray, but IMO it doesn't add meaningful complexity (even requires less
> code), and pretty much Just Works.  Like tried this a while back[1], but
> the implementation had undesirable behavior changes and stalled out.
> 
> Note, I want to do more aggressive cleanups of irqbypass at some point,
> e.g. not reporting an error to userspace if connect() fails is awful
> behavior for environments that want/need irqbypass to always work.  And
> KVM shold probably have a KVM_IRQFD_FLAG_NO_IRQBYPASS if a VM is never going
> to use device posted interrupts.  But those are future problems.
> 
> v2:
>  - Collect reviews. [Kevin, Michael]
>  - Track the pointer as "struct eventfd_ctx *eventfd" instead of "void *token".
>    [Alex]
>  - Fix typos and stale comments. [Kevin, Binbin]
>  - Use "trigger" instead of the null token/eventfd pointer on failure in
>    vfio_msi_set_vector_signal(). [Kevin]
>  - Drop a redundant "tmp == consumer" check from patch 3. [Kevin]
>  - Require producers to pass in the line IRQ number.
> 
> v1: https://lore.kernel.org/all/20250404211449.1443336-1-seanjc@google.com
> 
> [1] https://lore.kernel.org/all/20230801115646.33990-1-likexu@tencent.com
> [2] https://lore.kernel.org/all/20250401161804.842968-1-seanjc@google.com
> 
> Sean Christopherson (8):
>   irqbypass: Drop pointless and misleading THIS_MODULE get/put
>   irqbypass: Drop superfluous might_sleep() annotations
>   irqbypass: Take ownership of producer/consumer token tracking
>   irqbypass: Explicitly track producer and consumer bindings
>   irqbypass: Use paired consumer/producer to disconnect during
>     unregister
>   irqbypass: Use guard(mutex) in lieu of manual lock+unlock
>   irqbypass: Use xarray to track producers and consumers
>   irqbypass: Require producers to pass in Linux IRQ number during
>     registration
> 
>  arch/x86/kvm/x86.c                |   4 +-
>  drivers/vfio/pci/vfio_pci_intrs.c |  10 +-
>  drivers/vhost/vdpa.c              |  10 +-
>  include/linux/irqbypass.h         |  46 ++++----
>  virt/kvm/eventfd.c                |   7 +-
>  virt/lib/irqbypass.c              | 190 +++++++++++-------------------
>  6 files changed, 107 insertions(+), 160 deletions(-)
> 
> 
> base-commit: 7ef51a41466bc846ad794d505e2e34ff97157f7f

Sorry for the delay.  Do you intend to take this through your trees?

Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Re: [PATCH v2 0/8] irqbypass: Cleanups and a perf improvement
Posted by Sean Christopherson 6 months, 2 weeks ago
On Mon, Jun 02, 2025, Alex Williamson wrote:
> On Fri, 16 May 2025 16:07:26 -0700
> Sean Christopherson <seanjc@google.com> wrote:
> 
> > The two primary goals of this series are to make the irqbypass concept
> > easier to understand, and to address the terrible performance that can
> > result from using a list to track connections.
> > 
> > For the first goal, track the producer/consumer "tokens" as eventfd context
> > pointers instead of opaque "void *".  Supporting arbitrary token types was
> > dead infrastructure when it was added 10 years ago, and nothing has changed
> > since.  Taking an opaque token makes a very simple concept (device signals
> > eventfd; KVM listens to eventfd) unnecessarily difficult to understand.
> > 
> > Burying that simple behind a layer of obfuscation also makes the overall
> > code more brittle, as callers can pass in literally anything. I.e. passing
> > in a token that will never be paired would go unnoticed.
> > 
> > For the performance issue, use an xarray.  I'm definitely not wedded to an
> > xarray, but IMO it doesn't add meaningful complexity (even requires less
> > code), and pretty much Just Works.  Like tried this a while back[1], but
> > the implementation had undesirable behavior changes and stalled out.
> > 
> > Note, I want to do more aggressive cleanups of irqbypass at some point,
> > e.g. not reporting an error to userspace if connect() fails is awful
> > behavior for environments that want/need irqbypass to always work.  And
> > KVM shold probably have a KVM_IRQFD_FLAG_NO_IRQBYPASS if a VM is never going
> > to use device posted interrupts.  But those are future problems.
> > 
> > v2:
> >  - Collect reviews. [Kevin, Michael]
> >  - Track the pointer as "struct eventfd_ctx *eventfd" instead of "void *token".
> >    [Alex]
> >  - Fix typos and stale comments. [Kevin, Binbin]
> >  - Use "trigger" instead of the null token/eventfd pointer on failure in
> >    vfio_msi_set_vector_signal(). [Kevin]
> >  - Drop a redundant "tmp == consumer" check from patch 3. [Kevin]
> >  - Require producers to pass in the line IRQ number.
> > 
> > v1: https://lore.kernel.org/all/20250404211449.1443336-1-seanjc@google.com
> > 
> > [1] https://lore.kernel.org/all/20230801115646.33990-1-likexu@tencent.com
> > [2] https://lore.kernel.org/all/20250401161804.842968-1-seanjc@google.com
> > 
> > Sean Christopherson (8):
> >   irqbypass: Drop pointless and misleading THIS_MODULE get/put
> >   irqbypass: Drop superfluous might_sleep() annotations
> >   irqbypass: Take ownership of producer/consumer token tracking
> >   irqbypass: Explicitly track producer and consumer bindings
> >   irqbypass: Use paired consumer/producer to disconnect during
> >     unregister
> >   irqbypass: Use guard(mutex) in lieu of manual lock+unlock
> >   irqbypass: Use xarray to track producers and consumers
> >   irqbypass: Require producers to pass in Linux IRQ number during
> >     registration
> > 
> >  arch/x86/kvm/x86.c                |   4 +-
> >  drivers/vfio/pci/vfio_pci_intrs.c |  10 +-
> >  drivers/vhost/vdpa.c              |  10 +-
> >  include/linux/irqbypass.h         |  46 ++++----
> >  virt/kvm/eventfd.c                |   7 +-
> >  virt/lib/irqbypass.c              | 190 +++++++++++-------------------
> >  6 files changed, 107 insertions(+), 160 deletions(-)
> > 
> > 
> > base-commit: 7ef51a41466bc846ad794d505e2e34ff97157f7f
> 
> Sorry for the delay.

Heh, no worries.  ~2 weeks is downright prompt by my standards ;-)

> Do you intend to take this through your trees?

Yes, ideally, it would go into Paolo's kvm/next sooner than later (I'll start
poking him if necessary).  The s/token/eventfd rename creates an annoying conflict
in kvm/x86.c with an in-flight patch (significant code movement between files).
It would be nice to be able to rebase the in-flight patch instead of having to
resolve a merge confict (the conflict itself isn't difficult to resolve, I just
find it hard to visually review/audit the resolution due to the code movement).
Re: [PATCH v2 0/8] irqbypass: Cleanups and a perf improvement
Posted by Michael S. Tsirkin 7 months ago
On Fri, May 16, 2025 at 04:07:26PM -0700, Sean Christopherson wrote:
> The two primary goals of this series are to make the irqbypass concept
> easier to understand, and to address the terrible performance that can
> result from using a list to track connections.
> 
> For the first goal, track the producer/consumer "tokens" as eventfd context
> pointers instead of opaque "void *".  Supporting arbitrary token types was
> dead infrastructure when it was added 10 years ago, and nothing has changed
> since.  Taking an opaque token makes a very simple concept (device signals
> eventfd; KVM listens to eventfd) unnecessarily difficult to understand.
> 
> Burying that simple behind a layer of obfuscation also makes the overall
> code more brittle, as callers can pass in literally anything. I.e. passing
> in a token that will never be paired would go unnoticed.
> 
> For the performance issue, use an xarray.  I'm definitely not wedded to an
> xarray, but IMO it doesn't add meaningful complexity (even requires less
> code), and pretty much Just Works.  Like tried this a while back[1], but
> the implementation had undesirable behavior changes and stalled out.
> 
> Note, I want to do more aggressive cleanups of irqbypass at some point,
> e.g. not reporting an error to userspace if connect() fails is awful
> behavior for environments that want/need irqbypass to always work.  And
> KVM shold probably have a KVM_IRQFD_FLAG_NO_IRQBYPASS if a VM is never going
> to use device posted interrupts.  But those are future problems.
> 
> v2:
>  - Collect reviews. [Kevin, Michael]
>  - Track the pointer as "struct eventfd_ctx *eventfd" instead of "void *token".
>    [Alex]
>  - Fix typos and stale comments. [Kevin, Binbin]
>  - Use "trigger" instead of the null token/eventfd pointer on failure in
>    vfio_msi_set_vector_signal(). [Kevin]
>  - Drop a redundant "tmp == consumer" check from patch 3. [Kevin]
>  - Require producers to pass in the line IRQ number.


VDPA bits:

Acked-by: Michael S. Tsirkin <mst@redhat.com>

> v1: https://lore.kernel.org/all/20250404211449.1443336-1-seanjc@google.com
> 
> [1] https://lore.kernel.org/all/20230801115646.33990-1-likexu@tencent.com
> [2] https://lore.kernel.org/all/20250401161804.842968-1-seanjc@google.com
> 
> Sean Christopherson (8):
>   irqbypass: Drop pointless and misleading THIS_MODULE get/put
>   irqbypass: Drop superfluous might_sleep() annotations
>   irqbypass: Take ownership of producer/consumer token tracking
>   irqbypass: Explicitly track producer and consumer bindings
>   irqbypass: Use paired consumer/producer to disconnect during
>     unregister
>   irqbypass: Use guard(mutex) in lieu of manual lock+unlock
>   irqbypass: Use xarray to track producers and consumers
>   irqbypass: Require producers to pass in Linux IRQ number during
>     registration
> 
>  arch/x86/kvm/x86.c                |   4 +-
>  drivers/vfio/pci/vfio_pci_intrs.c |  10 +-
>  drivers/vhost/vdpa.c              |  10 +-
>  include/linux/irqbypass.h         |  46 ++++----
>  virt/kvm/eventfd.c                |   7 +-
>  virt/lib/irqbypass.c              | 190 +++++++++++-------------------
>  6 files changed, 107 insertions(+), 160 deletions(-)
> 
> 
> base-commit: 7ef51a41466bc846ad794d505e2e34ff97157f7f
> -- 
> 2.49.0.1112.g889b7c5bd8-goog