[v2] swiotlb: Add child io tlb mem support

[RFC PATCH V2 0/2] swiotlb: Add child io tlb mem support

Posted by Tianyu Lan 4 years ago

From: Tianyu Lan <Tianyu.Lan@microsoft.com>

Traditionally swiotlb was not performance critical because it was only
used for slow devices. But in some setups, like TDX/SEV confidential
guests, all IO has to go through swiotlb. Currently swiotlb only has a
single lock. Under high IO load with multiple CPUs this can lead to
significant lock contention on the swiotlb lock.

This series adds child IO TLB mem support to reduce spinlock contention
among a device's queues. Each device may allocate its own IO TLB mem and
set up child IO TLB mems according to its queue count. The number of child
IO TLB mems may be set equal to the device's queue count, which reduces
swiotlb spinlock contention among devices and queues.

Patch 2 introduces the IO TLB block concept and the swiotlb_device_allocate()
API to allocate a per-device swiotlb bounce buffer. The new API accepts the
queue count as the number of child IO TLB mems to set up for the device's
IO TLB mem.
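
For illustration only, here is a minimal user-space sketch of the idea
described above: one pool per queue, each with its own lock, so that
allocations on different queues do not contend. It is not taken from the
patches. Every name in it (child_pool, dev_pool, dev_pool_init,
dev_pool_alloc, MAX_CHILDREN) is hypothetical, a C11 atomic_flag stands in
for the kernel spinlock, and the slot bookkeeping is a simple bump counter
rather than the real swiotlb slot search.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAX_CHILDREN 8	/* hypothetical upper bound on queues per device */

/* One child pool per device queue; each has its own lock (assumption). */
struct child_pool {
	atomic_flag lock;	/* stand-in for a kernel spinlock */
	unsigned int used;	/* slots handed out so far */
	unsigned int nslots;	/* slots owned by this child */
};

struct dev_pool {
	unsigned int nchildren;
	struct child_pool child[MAX_CHILDREN];
};

static void pool_lock(struct child_pool *c)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&c->lock, memory_order_acquire))
		;
}

static void pool_unlock(struct child_pool *c)
{
	atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

/* Split total_slots evenly across nqueues child pools. Returns 0 or -1. */
static int dev_pool_init(struct dev_pool *p, unsigned int nqueues,
			 unsigned int total_slots)
{
	unsigned int i;

	if (nqueues == 0 || nqueues > MAX_CHILDREN || total_slots < nqueues)
		return -1;

	p->nchildren = nqueues;
	for (i = 0; i < nqueues; i++) {
		atomic_flag_clear(&p->child[i].lock);
		p->child[i].used = 0;
		p->child[i].nslots = total_slots / nqueues;
	}
	return 0;
}

/*
 * Allocate one slot for the given queue. Only that queue's child lock is
 * taken, so other queues proceed in parallel. Returns a device-wide slot
 * index, or -1 if the child pool is exhausted.
 */
static int dev_pool_alloc(struct dev_pool *p, unsigned int queue)
{
	unsigned int idx = queue % p->nchildren;
	struct child_pool *c = &p->child[idx];
	int slot = -1;

	pool_lock(c);
	if (c->used < c->nslots)
		slot = (int)(idx * c->nslots + c->used++);
	pool_unlock(c);

	return slot;
}
```

With 4 queues and 16 slots, each child pool owns 4 slots, and an allocation
on queue 1 never touches the lock used by queue 0. In this sketch a child
that runs out of slots simply fails; whether and how the real code falls
back to another pool is not something the sketch tries to model.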

Tianyu Lan (2):
  swiotlb: Add Child IO TLB mem support
  Swiotlb: Add device bounce buffer allocation interface

 include/linux/swiotlb.h |  40 ++++++
 kernel/dma/swiotlb.c    | 290 ++++++++++++++++++++++++++++++++++++++--
 2 files changed, 317 insertions(+), 13 deletions(-)

-- 
2.25.1

Re: [RFC PATCH V2 0/2] swiotlb: Add child io tlb mem support

Posted by Tianyu Lan 4 years ago

On 5/2/2022 8:54 PM, Tianyu Lan wrote:
> From: Tianyu Lan <Tianyu.Lan@microsoft.com>
> 
> Traditionally swiotlb was not performance critical because it was only
> used for slow devices. But in some setups, like TDX/SEV confidential
> guests, all IO has to go through swiotlb. Currently swiotlb only has a
> single lock. Under high IO load with multiple CPUs this can lead to
> significant lock contention on the swiotlb lock.
> 
> This series adds child IO TLB mem support to reduce spinlock contention
> among a device's queues. Each device may allocate its own IO TLB mem and
> set up child IO TLB mems according to its queue count. The number of child
> IO TLB mems may be set equal to the device's queue count, which reduces
> swiotlb spinlock contention among devices and queues.
> 
> Patch 2 introduces the IO TLB block concept and the swiotlb_device_allocate()
> API to allocate a per-device swiotlb bounce buffer. The new API accepts the
> queue count as the number of child IO TLB mems to set up for the device's
> IO TLB mem.

Gentle ping...

Thanks.
> 
> Tianyu Lan (2):
>    swiotlb: Add Child IO TLB mem support
>    Swiotlb: Add device bounce buffer allocation interface
> 
>   include/linux/swiotlb.h |  40 ++++++
>   kernel/dma/swiotlb.c    | 290 ++++++++++++++++++++++++++++++++++++++--
>   2 files changed, 317 insertions(+), 13 deletions(-)
>