From: Petr Tesarik <petr.tesarik1@huawei-partners.com>

If the allocation alignment is bigger than IO_TLB_SIZE and min_align_mask
covers some bits in the original address between IO_TLB_SIZE and
alloc_align_mask, preserve these bits by allocating additional padding
slots before the actual swiotlb buffer.

Changes from v2
---------------
* Fix assignment of an uninitialized variable to pad_slots.
* Improve commit message wrt INVALID_PHYS_ADDR.

Changes from v1
---------------
* Rename padding to pad_slots.
* Set pad_slots only for the first allocated non-padding slot.
* Do not bother initializing orig_addr to INVALID_PHYS_ADDR.
* Change list and pad_slots to unsigned short to avoid growing
  struct io_tlb_slot on 32-bit targets.
* Add build-time check that list and pad_slots can hold the maximum
  allowed value of IO_TLB_SEGSIZE.

Petr Tesarik (2):
  swiotlb: extend buffer pre-padding to alloc_align_mask if necessary
  bug: introduce ASSERT_VAR_CAN_HOLD()

 include/linux/build_bug.h | 10 ++++++++++
 kernel/dma/swiotlb.c      | 37 +++++++++++++++++++++++++++++++------
 2 files changed, 41 insertions(+), 6 deletions(-)

-- 
2.34.1
Hi Petr,

On Thu, Mar 21, 2024 at 06:19:00PM +0100, Petr Tesarik wrote:
> From: Petr Tesarik <petr.tesarik1@huawei-partners.com>
>
> If the allocation alignment is bigger than IO_TLB_SIZE and min_align_mask
> covers some bits in the original address between IO_TLB_SIZE and
> alloc_align_mask, preserve these bits by allocating additional padding
> slots before the actual swiotlb buffer.

Thanks for fixing this! I was out at a conference last week, so I didn't
get very far with it myself, but I ended up in a pickle trying to avoid
extending 'struct io_tlb_slot'. Your solution is much better than the
crazy avenue I started going down...

With your changes, can we now simplify swiotlb_align_offset() to ignore
dma_get_min_align_mask() altogether and just:

	return addr & (IO_TLB_SIZE - 1);

?

Will
On Fri, 22 Mar 2024 15:09:41 +0000
Will Deacon <will@kernel.org> wrote:
> Hi Petr,
>
> On Thu, Mar 21, 2024 at 06:19:00PM +0100, Petr Tesarik wrote:
> > From: Petr Tesarik <petr.tesarik1@huawei-partners.com>
> >
> > If the allocation alignment is bigger than IO_TLB_SIZE and min_align_mask
> > covers some bits in the original address between IO_TLB_SIZE and
> > alloc_align_mask, preserve these bits by allocating additional padding
> > slots before the actual swiotlb buffer.
>
> Thanks for fixing this! I was out at a conference last week, so I didn't
> get very far with it myself, but I ended up in a pickle trying to avoid
> extending 'struct io_tlb_slot'. Your solution is much better than the
> crazy avenue I started going down...
>
> With your changes, can we now simplify swiotlb_align_offset() to ignore
> dma_get_min_align_mask() altogether and just:
>
> return addr & (IO_TLB_SIZE - 1);
I have also thought about this but I don't think it's right. If we
removed dma_get_min_align_mask() from swiotlb_align_offset(), we would
always ask to preserve the lowest IO_TLB_SHIFT bits. This may cause
less efficient use of the SWIOTLB.
For example, if a device does not specify any min_align_mask, it is
presumably happy with any buffer alignment, so SWIOTLB may allocate at
the beginning of a slot, like here:
  orig_addr  |   ++|++     |
  tlb_addr   |++++ |       |
Without dma_get_min_align_mask() in swiotlb_align_offset(), it would
have to allocate two mostly-empty slots:
  tlb_addr   |   ++|++     |
where:
  |       marks a multiple of IO_TLB_SIZE (in physical address space)
  +       used memory
  (blank) free memory
Petr T
On Fri, Mar 22, 2024 at 06:51:38PM +0100, Petr Tesařík wrote:
> On Fri, 22 Mar 2024 15:09:41 +0000
> Will Deacon <will@kernel.org> wrote:
>
> > Hi Petr,
> >
> > On Thu, Mar 21, 2024 at 06:19:00PM +0100, Petr Tesarik wrote:
> > > From: Petr Tesarik <petr.tesarik1@huawei-partners.com>
> > >
> > > If the allocation alignment is bigger than IO_TLB_SIZE and min_align_mask
> > > covers some bits in the original address between IO_TLB_SIZE and
> > > alloc_align_mask, preserve these bits by allocating additional padding
> > > slots before the actual swiotlb buffer.
> >
> > Thanks for fixing this! I was out at a conference last week, so I didn't
> > get very far with it myself, but I ended up in a pickle trying to avoid
> > extending 'struct io_tlb_slot'. Your solution is much better than the
> > crazy avenue I started going down...
> >
> > With your changes, can we now simplify swiotlb_align_offset() to ignore
> > dma_get_min_align_mask() altogether and just:
> >
> > 	return addr & (IO_TLB_SIZE - 1);
>
> I have also thought about this but I don't think it's right. If we
> removed dma_get_min_align_mask() from swiotlb_align_offset(), we would
> always ask to preserve the lowest IO_TLB_SHIFT bits. This may cause
> less efficient use of the SWIOTLB.
>
> For example, if a device does not specify any min_align_mask, it is
> presumably happy with any buffer alignment, so SWIOTLB may allocate at
> the beginning of a slot, like here:
>
>   orig_addr  |   ++|++     |
>   tlb_addr   |++++ |       |
>
> Without dma_get_min_align_mask() in swiotlb_align_offset(), it would
> have to allocate two mostly-empty slots:
>
>   tlb_addr   |   ++|++     |
>
> where:
>   |       marks a multiple of IO_TLB_SIZE (in physical address space)
>   +       used memory
>   (blank) free memory

Thanks for the patient explanation. I'd got so caught up with the DMA
alignment mask that I forgot the usual case where it's not specified at
all!

Will