RE: [EXT] Re: [PATCH] media: videobuf2-dma-sg: limit the sg segment size

Posted by Hui Fang 2 years, 3 months ago
On Wed, Sep 6, 2023 at 18:28 PM Tomasz Figa <tfiga@chromium.org> wrote:
> That all makes sense, but it still doesn't answer the real question on why
> swiotlb ends up being used. I think you may want to trace what happens in
> the DMA mapping ops implementation on your system causing it to use
> swiotlb.

I added logging and deliberately fed invalid data into the low buffer;
this confirms that swiotlb is actually used.

The log is:
"[  846.570271][  T138] software IO TLB: ==== swiotlb_bounce: DMA_TO_DEVICE,
 dst 000000004589fa38, src 00000000c6d7e8d8, srcPhy 5504139264, size 4096".

"srcPhy 5504139264" is above 4G (the 8mp has more than 5G of DRAM).
The kernel config has "CONFIG_ZONE_DMA32=y", so the static swiotlb is used.
Also, the host (win10) side can't get a valid image.

The debug code is below.
diff --git a/drivers/media/common/videobuf2/videobuf2-dma-sg.c b/drivers/media/common/videobuf2/videobuf2-dma-sg.c
index 7f83a86e6810..de03704ce695 100644
--- a/drivers/media/common/videobuf2/videobuf2-dma-sg.c
+++ b/drivers/media/common/videobuf2/videobuf2-dma-sg.c
@@ -98,6 +98,7 @@ static int vb2_dma_sg_alloc_compacted(struct vb2_dma_sg_buf *buf,
        return 0;
 }
 
+bool g_v4l2 = false;
 static void *vb2_dma_sg_alloc(struct vb2_buffer *vb, struct device *dev,
                              unsigned long size)
 {
@@ -144,6 +145,7 @@ static void *vb2_dma_sg_alloc(struct vb2_buffer *vb, struct device *dev,
        if (ret)
                goto fail_table_alloc;
 
+       g_v4l2 = true;
        pr_info("==== vb2_dma_sg_alloc, call sg_alloc_table_from_pages_segment, size %d, max_segment %d\n",
                (int)size, (int)max_segment);
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index dac01ace03a0..a2cda646a02f 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -523,6 +523,7 @@ static unsigned int swiotlb_align_offset(struct device *dev, u64 addr)
        return addr & dma_get_min_align_mask(dev) & (IO_TLB_SIZE - 1);
 }
 
+extern bool g_v4l2;
 /*
  * Bounce: copy the swiotlb buffer from or back to the original dma location
  */
@@ -591,8 +592,19 @@ static void swiotlb_bounce(struct device *dev, phys_addr_t tlb_addr, size_t size
                }
        } else if (dir == DMA_TO_DEVICE) {
                memcpy(vaddr, phys_to_virt(orig_addr), size);
+               if (g_v4l2) {
+                       static unsigned char val;
+                       val++;
+                       memset(vaddr, val, size);
+
+                       pr_info("====xx %s: DMA_TO_DEVICE, dst %p, src %p, srcPhy %llu, size %zu\n",
+                               __func__, vaddr, phys_to_virt(orig_addr), orig_addr, size);
+               }
        } else {
                memcpy(phys_to_virt(orig_addr), vaddr, size);
        }
 }
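
As a rough user-space illustration of the debugging trick in the diff above
(this is not kernel code; bounce_copy(), poison_enabled and enum toy_dir are
made-up names standing in for swiotlb_bounce() and g_v4l2): after the normal
copy into the low buffer, the data is overwritten with a changing byte
pattern. If the receiving side then sees the pattern instead of the image,
the bounce buffer must be in the data path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical direction enum; the kernel uses enum dma_data_direction. */
enum toy_dir { TOY_TO_DEVICE, TOY_FROM_DEVICE };

/* Hypothetical debug switch, standing in for g_v4l2 in the diff. */
static bool poison_enabled;

/*
 * User-space model of the bounce step: "orig" is the caller's buffer,
 * "bounce" is the low buffer that the device actually reads or writes.
 */
static void bounce_copy(unsigned char *bounce, unsigned char *orig,
			size_t size, enum toy_dir dir)
{
	static unsigned char val;

	assert(bounce != NULL && orig != NULL);

	if (dir == TOY_TO_DEVICE) {
		memcpy(bounce, orig, size);
		if (poison_enabled) {
			/* Replace what the device will see with a changing pattern. */
			val++;
			memset(bounce, val, size);
		}
	} else {
		memcpy(orig, bounce, size);
	}
}
```

With poison_enabled set, each TOY_TO_DEVICE call leaves the original buffer
untouched and fills the low buffer with the next pattern byte, which is what
makes the bounce visible on the receiving side.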


BRs,
Fang Hui

Re: [EXT] Re: [PATCH] media: videobuf2-dma-sg: limit the sg segment size
Posted by Tomasz Figa 2 years, 3 months ago
On Mon, Sep 11, 2023 at 3:13 PM Hui Fang <hui.fang@nxp.com> wrote:
>
> On Wed, Sep 6, 2023 at 18:28 PM Tomasz Figa <tfiga@chromium.org> wrote:
> > That all makes sense, but it still doesn't answer the real question on why
> > swiotlb ends up being used. I think you may want to trace what happens in
> > the DMA mapping ops implementation on your system causing it to use
> > swiotlb.
>
> I added logging and deliberately fed invalid data into the low buffer;
> this confirms that swiotlb is actually used.
>

Yes, that we already know. But why?

RE: [EXT] Re: [PATCH] media: videobuf2-dma-sg: limit the sg segment size
Posted by Hui Fang 2 years, 3 months ago
On Tue, Sep 12 2023 at 11:22 AM Tomasz Figa <tfiga@chromium.org> wrote:
> On Mon, Sep 11, 2023 at 3:13 PM Hui Fang <hui.fang@nxp.com> wrote:
> >
> > On Wed, Sep 6, 2023 at 18:28 PM Tomasz Figa <tfiga@chromium.org>
> wrote:
> > > That all makes sense, but it still doesn't answer the real question
> > > on why swiotlb ends up being used. I think you may want to trace
> > > what happens in the DMA mapping ops implementation on your system
> > > causing it to use swiotlb.
> >
> > I added logging and deliberately fed invalid data into the low buffer;
> > this confirms that swiotlb is actually used.
> >
> 
> Yes, that we already know. But why?


The physical address of the v4l2 buffer is above 4G (5504139264), so swiotlb is used.
"[  846.570271][  T138] software IO TLB: ==== swiotlb_bounce: DMA_TO_DEVICE,
 dst 000000004589fa38, src 00000000c6d7e8d8, srcPhy 5504139264, size 4096".
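
The arithmetic behind this can be sketched in plain C. This is only an
illustration under stated assumptions: needs_bounce() is a made-up helper, it
assumes a 1:1 physical-to-DMA address mapping (no IOMMU, no offset), and the
DMA_BIT_MASK() macro is copied here so the snippet builds outside the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mask with the low n bits set, same shape as the kernel's DMA_BIT_MASK(n). */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/*
 * Hypothetical helper: returns true if the range [phys, phys + size) is
 * not fully addressable under dma_mask, i.e. the device cannot reach the
 * buffer directly and it would have to be bounced.
 */
static bool needs_bounce(uint64_t phys, size_t size, uint64_t dma_mask)
{
	uint64_t last;

	assert(size > 0);

	last = phys + (uint64_t)size - 1;
	if (last < phys)	/* wrapped around the 64-bit space */
		return true;

	return last > dma_mask;
}
```

With the logged values, needs_bounce(5504139264ULL, 4096, DMA_BIT_MASK(32))
is true: 5504139264 is roughly 5.13 GiB, while the highest address reachable
with a 32-bit mask is 4294967295.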

BRs,
Fang Hui
Re: [EXT] Re: [PATCH] media: videobuf2-dma-sg: limit the sg segment size
Posted by Tomasz Figa 2 years, 3 months ago
On Tue, Sep 12, 2023 at 4:01 PM Hui Fang <hui.fang@nxp.com> wrote:
>
> On Tue, Sep 12 2023 at 11:22 AM Tomasz Figa <tfiga@chromium.org> wrote:
> > On Mon, Sep 11, 2023 at 3:13 PM Hui Fang <hui.fang@nxp.com> wrote:
> > >
> > > On Wed, Sep 6, 2023 at 18:28 PM Tomasz Figa <tfiga@chromium.org>
> > wrote:
> > > > That all makes sense, but it still doesn't answer the real question
> > > > on why swiotlb ends up being used. I think you may want to trace
> > > > what happens in the DMA mapping ops implementation on your system
> > > > causing it to use swiotlb.
> > >
> > > I added logging and deliberately fed invalid data into the low buffer;
> > > this confirms that swiotlb is actually used.
> > >
> >
> > Yes, that we already know. But why?
>
>
> The physical address of the v4l2 buffer is above 4G (5504139264), so swiotlb is used.
> "[  846.570271][  T138] software IO TLB: ==== swiotlb_bounce: DMA_TO_DEVICE,
>  dst 000000004589fa38, src 00000000c6d7e8d8, srcPhy 5504139264, size 4096".

Is your DMA device restricted only to the bottom-most 4 GB (32-bit DMA
address)? If yes, would it make sense to also allocate from that area
rather than bouncing the memory?

Best regards,
Tomasz
RE: [EXT] Re: [PATCH] media: videobuf2-dma-sg: limit the sg segment size
Posted by Hui Fang 2 years, 3 months ago
On Tue, Sep 12, 2023 at 4:11 PM Tomasz Figa <tfiga@chromium.org> wrote:
> Is your DMA device restricted only to the bottom-most 4 GB (32-bit DMA
> address)? If yes, would it make sense to also allocate from that area rather
> than bouncing the memory?

The DMA device uses 32-bit DMA addresses.
User space cannot control the v4l2 buffer address, so the fix may still have
to be a change to the code of vb2_dma_sg_alloc().

BRs,
Fang Hui
Re: [EXT] Re: [PATCH] media: videobuf2-dma-sg: limit the sg segment size
Posted by Tomasz Figa 2 years, 3 months ago
On Tue, Sep 12, 2023 at 4:43 PM Hui Fang <hui.fang@nxp.com> wrote:
>
> On Tue, Sep 12, 2023 at 4:11 PM Tomasz Figa <tfiga@chromium.org> wrote:
> > Is your DMA device restricted only to the bottom-most 4 GB (32-bit DMA
> > address)? If yes, would it make sense to also allocate from that area rather
> > than bouncing the memory?
>
> The DMA device uses 32-bit DMA addresses.
> User space cannot control the v4l2 buffer address, so the fix may still have
> to be a change to the code of vb2_dma_sg_alloc().

Right. You may want to try modifying vb2_dma_sg_alloc_compacted() to
use dma_alloc_pages() instead of alloc_pages().
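
To illustrate the idea (a toy user-space model only, not the kernel
allocator): alloc_pages() takes no device argument, so nothing ties the
returned pages to what the device can address, while a device-aware
allocation can stay within the device's reach. struct toy_ram,
toy_alloc_low() and toy_alloc_any() below are invented for this sketch, and
the "prefer high memory first" behaviour is a simplifying assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SZ  4096ULL
#define LIMIT_4G (1ULL << 32)

/* Toy RAM with two top-down cursors; assumes RAM extends past 4 GiB. */
struct toy_ram {
	uint64_t high_top; /* next free top in [4 GiB, end of RAM) */
	uint64_t low_top;  /* next free top in [0, 4 GiB) */
};

/* Constrained: only hands out pages a 32-bit device can reach. */
static uint64_t toy_alloc_low(struct toy_ram *r, size_t npages)
{
	uint64_t bytes = npages * PAGE_SZ;

	assert(npages > 0);
	/* Address 0 is reserved as the failure value. */
	if (r->low_top < bytes + PAGE_SZ)
		return 0;
	r->low_top -= bytes;
	return r->low_top;
}

/* Unconstrained: tries memory above 4 GiB first, then falls back. */
static uint64_t toy_alloc_any(struct toy_ram *r, size_t npages)
{
	uint64_t bytes = npages * PAGE_SZ;

	assert(npages > 0 && r->high_top >= LIMIT_4G);
	if (r->high_top - LIMIT_4G >= bytes) {
		r->high_top -= bytes;
		return r->high_top;
	}
	return toy_alloc_low(r, npages);
}
```

On a toy system with 6 GiB of RAM, toy_alloc_any() returns a page above
4 GiB that a 32-bit device could only use through a bounce buffer, while
toy_alloc_low() returns a page the device can access directly, so no extra
copy is needed.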

Best regards,
Tomasz
RE: [EXT] Re: [PATCH] media: videobuf2-dma-sg: limit the sg segment size
Posted by Hui Fang 2 years, 3 months ago
On Tue, Sep 12, 2023 at 16:52 PM Tomasz Figa <tfiga@chromium.org> wrote:
> Right. You may want to try modifying vb2_dma_sg_alloc_compacted() to use
> dma_alloc_pages() instead of alloc_pages().

Thanks for your suggestion, it works. It is also a better solution, since it
needs no extra copy from the high buffer to the low buffer.

BRs,
Fang Hui
Re: [EXT] Re: [PATCH] media: videobuf2-dma-sg: limit the sg segment size
Posted by Tomasz Figa 2 years, 3 months ago
On Wed, Sep 13, 2023 at 6:14 PM Hui Fang <hui.fang@nxp.com> wrote:
>
> On Tue, Sep 12, 2023 at 16:52 PM Tomasz Figa <tfiga@chromium.org> wrote:
> > Right. You may want to try modifying vb2_dma_sg_alloc_compacted() to use
> > dma_alloc_pages() instead of alloc_pages().
>
> Thanks for your suggestion, it works. It is also a better solution, since it
> needs no extra copy from the high buffer to the low buffer.

Great to hear! Could you submit a patch? Would appreciate adding

Suggested-by: Tomasz Figa <tfiga@chromium.org>

above the Signed-off-by line if you don't mind. Thanks.

Best regards,
Tomasz
RE: [EXT] Re: [PATCH] media: videobuf2-dma-sg: limit the sg segment size
Posted by Hui Fang 2 years, 3 months ago
On Wed, Sep 13, 2023 at 6:44 PM Tomasz Figa <tfiga@chromium.org> wrote:
> Great to hear! Could you submit a patch? Would appreciate adding
> 
> Suggested-by: Tomasz Figa <tfiga@chromium.org>
> 
> above the Signed-off-by line if you don't mind. Thanks.

Sure. I will verify it on other i.MX boards, then push.

BRs,
Fang Hui
RE: [EXT] Re: [PATCH] media: videobuf2-dma-sg: limit the sg segment size
Posted by Hui Fang 2 years, 3 months ago
> On Wed, Sep 13, 2023 at 21:17 PM Fang Hui <hui.fang@nxp.com> wrote:
> > above the Signed-off-by line if you don't mind. Thanks.
> 
> Sure. I will verify it on other i.MX boards, then push.

Ref https://lore.kernel.org/all/20230914145812.12851-1-hui.fang@nxp.com/

BRs,
Fang Hui