From: Petr Tesarik <petr.tesarik1@huawei-partners.com>

If the allocation alignment is bigger than IO_TLB_SIZE and min_align_mask
covers some bits in the original address between IO_TLB_SIZE and
alloc_align_mask, preserve these bits by allocating additional padding
slots before the actual swiotlb buffer.

Changes from v2
---------------
* Fix assignment of an uninitialized variable to pad_slots.
* Improve commit message wrt INVALID_PHYS_ADDR.

Changes from v1
---------------
* Rename padding to pad_slots.
* Set pad_slots only for the first allocated non-padding slot.
* Do not bother initializing orig_addr to INVALID_PHYS_ADDR.
* Change list and pad_slots to unsigned short to avoid growing
  struct io_tlb_slot on 32-bit targets.
* Add build-time check that list and pad_slots can hold the maximum
  allowed value of IO_TLB_SEGSIZE.

Petr Tesarik (2):
  swiotlb: extend buffer pre-padding to alloc_align_mask if necessary
  bug: introduce ASSERT_VAR_CAN_HOLD()

 include/linux/build_bug.h | 10 ++++++++++
 kernel/dma/swiotlb.c      | 37 +++++++++++++++++++++++++++++++------
 2 files changed, 41 insertions(+), 6 deletions(-)

-- 
2.34.1
Hi Petr,

On Thu, Mar 21, 2024 at 06:19:00PM +0100, Petr Tesarik wrote:
> From: Petr Tesarik <petr.tesarik1@huawei-partners.com>
>
> If the allocation alignment is bigger than IO_TLB_SIZE and min_align_mask
> covers some bits in the original address between IO_TLB_SIZE and
> alloc_align_mask, preserve these bits by allocating additional padding
> slots before the actual swiotlb buffer.

Thanks for fixing this! I was out at a conference last week, so I didn't
get very far with it myself, but I ended up in a pickle trying to avoid
extending 'struct io_tlb_slot'. Your solution is much better than the
crazy avenue I started going down...

With your changes, can we now simplify swiotlb_align_offset() to ignore
dma_get_min_align_mask() altogether and just:

	return addr & (IO_TLB_SIZE - 1);

?

Will
On Fri, 22 Mar 2024 15:09:41 +0000
Will Deacon <will@kernel.org> wrote:
> Hi Petr,
>
> On Thu, Mar 21, 2024 at 06:19:00PM +0100, Petr Tesarik wrote:
> > From: Petr Tesarik <petr.tesarik1@huawei-partners.com>
> >
> > If the allocation alignment is bigger than IO_TLB_SIZE and min_align_mask
> > covers some bits in the original address between IO_TLB_SIZE and
> > alloc_align_mask, preserve these bits by allocating additional padding
> > slots before the actual swiotlb buffer.
>
> Thanks for fixing this! I was out at a conference last week, so I didn't
> get very far with it myself, but I ended up in a pickle trying to avoid
> extending 'struct io_tlb_slot'. Your solution is much better than the
> crazy avenue I started going down...
>
> With your changes, can we now simplify swiotlb_align_offset() to ignore
> dma_get_min_align_mask() altogether and just:
>
> return addr & (IO_TLB_SIZE - 1);
I have also thought about this but I don't think it's right. If we
removed dma_get_min_align_mask() from swiotlb_align_offset(), we would
always ask to preserve the lowest IO_TLB_SHIFT bits. This may cause
less efficient use of the SWIOTLB.
For example, if a device does not specify any min_align_mask, it is
presumably happy with any buffer alignment, so SWIOTLB may allocate at
the beginning of a slot, like here:
  orig_addr  |   ++|++     |
  tlb_addr   |++++ |       |
Without dma_get_min_align_mask() in swiotlb_align_offset(), it would
have to allocate two mostly-empty slots:
  tlb_addr   |   ++|++     |
where:
  |       marks a multiple of IO_TLB_SIZE (in physical address space)
  +       used memory
  (blank) free memory
Petr T
On Fri, Mar 22, 2024 at 06:51:38PM +0100, Petr Tesařík wrote:
> On Fri, 22 Mar 2024 15:09:41 +0000
> Will Deacon <will@kernel.org> wrote:
>
> > Hi Petr,
> >
> > On Thu, Mar 21, 2024 at 06:19:00PM +0100, Petr Tesarik wrote:
> > > From: Petr Tesarik <petr.tesarik1@huawei-partners.com>
> > >
> > > If the allocation alignment is bigger than IO_TLB_SIZE and min_align_mask
> > > covers some bits in the original address between IO_TLB_SIZE and
> > > alloc_align_mask, preserve these bits by allocating additional padding
> > > slots before the actual swiotlb buffer.
> >
> > Thanks for fixing this! I was out at a conference last week, so I didn't
> > get very far with it myself, but I ended up in a pickle trying to avoid
> > extending 'struct io_tlb_slot'. Your solution is much better than the
> > crazy avenue I started going down...
> >
> > With your changes, can we now simplify swiotlb_align_offset() to ignore
> > dma_get_min_align_mask() altogether and just:
> >
> > 	return addr & (IO_TLB_SIZE - 1);
>
> I have also thought about this but I don't think it's right. If we
> removed dma_get_min_align_mask() from swiotlb_align_offset(), we would
> always ask to preserve the lowest IO_TLB_SHIFT bits. This may cause
> less efficient use of the SWIOTLB.
>
> For example, if a device does not specify any min_align_mask, it is
> presumably happy with any buffer alignment, so SWIOTLB may allocate at
> the beginning of a slot, like here:
>
>   orig_addr  |   ++|++     |
>   tlb_addr   |++++ |       |
>
> Without dma_get_min_align_mask() in swiotlb_align_offset(), it would
> have to allocate two mostly-empty slots:
>
>   tlb_addr   |   ++|++     |
>
> where:
>   |       marks a multiple of IO_TLB_SIZE (in physical address space)
>   +       used memory
>   (blank) free memory

Thanks for the patient explanation. I'd got so caught up with the DMA
alignment mask that I forgot the usual case where it's not specified at
all!

Will