[PATCH v3 1/2] media: videobuf2: Fix dmabuf cache sync/flush in dma-contig

Mikhail Rudenko posted 2 patches 1 year ago
There is a newer version of this series
[PATCH v3 1/2] media: videobuf2: Fix dmabuf cache sync/flush in dma-contig
Posted by Mikhail Rudenko 1 year ago
When support for V4L2_FLAG_MEMORY_NON_CONSISTENT was removed in
commit 129134e5415d ("media: media/v4l2: remove
V4L2_FLAG_MEMORY_NON_CONSISTENT flag"),
vb2_dc_dmabuf_ops_{begin,end}_cpu_access() functions were made
no-ops. Later, when support for V4L2_MEMORY_FLAG_NON_COHERENT was
introduced in commit c0acf9cfeee0 ("media: videobuf2: handle
V4L2_MEMORY_FLAG_NON_COHERENT flag"), the above functions remained
no-ops, making cache maintenance for non-coherent dmabufs allocated by
dma-contig impossible.

Fix this by reintroducing dma_sync_sgtable_for_{cpu,device} and
{flush,invalidate}_kernel_vmap_range calls to
vb2_dc_dmabuf_ops_{begin,end}_cpu_access() functions for non-coherent
buffers.

Fixes: c0acf9cfeee0 ("media: videobuf2: handle V4L2_MEMORY_FLAG_NON_COHERENT flag")
Cc: stable@vger.kernel.org
Signed-off-by: Mikhail Rudenko <mike.rudenko@gmail.com>
---
 .../media/common/videobuf2/videobuf2-dma-contig.c  | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
index bb0b7fa67b539aa73ad5ccf3c3bc318e26f8a4cb..146d7997a0da5989fb081a6f28ce0641fe726e63 100644
--- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
+++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
@@ -427,6 +427,17 @@ static int
 vb2_dc_dmabuf_ops_begin_cpu_access(struct dma_buf *dbuf,
 				   enum dma_data_direction direction)
 {
+	struct vb2_dc_buf *buf = dbuf->priv;
+	struct sg_table *sgt = buf->dma_sgt;
+
+	if (!buf->non_coherent_mem)
+		return 0;
+
+	if (buf->vaddr)
+		invalidate_kernel_vmap_range(buf->vaddr, buf->size);
+
+	dma_sync_sgtable_for_cpu(buf->dev, sgt, buf->dma_dir);
+
 	return 0;
 }
 
@@ -434,6 +445,17 @@ static int
 vb2_dc_dmabuf_ops_end_cpu_access(struct dma_buf *dbuf,
 				 enum dma_data_direction direction)
 {
+	struct vb2_dc_buf *buf = dbuf->priv;
+	struct sg_table *sgt = buf->dma_sgt;
+
+	if (!buf->non_coherent_mem)
+		return 0;
+
+	if (buf->vaddr)
+		flush_kernel_vmap_range(buf->vaddr, buf->size);
+
+	dma_sync_sgtable_for_device(buf->dev, sgt, buf->dma_dir);
+
 	return 0;
 }
 

-- 
2.47.1
Re: [PATCH v3 1/2] media: videobuf2: Fix dmabuf cache sync/flush in dma-contig
Posted by Nicolas Dufresne 11 months, 2 weeks ago
Le mardi 28 janvier 2025 à 23:35 +0300, Mikhail Rudenko a écrit :
> When support for V4L2_FLAG_MEMORY_NON_CONSISTENT was removed in
> commit 129134e5415d ("media: media/v4l2: remove
> V4L2_FLAG_MEMORY_NON_CONSISTENT flag"),
> vb2_dc_dmabuf_ops_{begin,end}_cpu_access() functions were made
> no-ops. Later, when support for V4L2_MEMORY_FLAG_NON_COHERENT was
> introduced in commit c0acf9cfeee0 ("media: videobuf2: handle
> V4L2_MEMORY_FLAG_NON_COHERENT flag"), the above functions remained
> no-ops, making cache maintenance for non-coherent dmabufs allocated
> by
> dma-contig impossible.
> 
> Fix this by reintroducing dma_sync_sgtable_for_{cpu,device} and
> {flush,invalidate}_kernel_vmap_range calls to
> vb2_dc_dmabuf_ops_{begin,end}_cpu_access() functions for non-coherent
> buffers.
> 
> Fixes: c0acf9cfeee0 ("media: videobuf2: handle
> V4L2_MEMORY_FLAG_NON_COHERENT flag")
> Cc: stable@vger.kernel.org
> Signed-off-by: Mikhail Rudenko <mike.rudenko@gmail.com>
> ---
>  .../media/common/videobuf2/videobuf2-dma-contig.c  | 22
> ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)
> 
> diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> index
> bb0b7fa67b539aa73ad5ccf3c3bc318e26f8a4cb..146d7997a0da5989fb081a6f28c
> e0641fe726e63 100644
> --- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> +++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> @@ -427,6 +427,17 @@ static int
>  vb2_dc_dmabuf_ops_begin_cpu_access(struct dma_buf *dbuf,
>  				   enum dma_data_direction
> direction)
>  {
> +	struct vb2_dc_buf *buf = dbuf->priv;
> +	struct sg_table *sgt = buf->dma_sgt;
> +
> +	if (!buf->non_coherent_mem)
> +		return 0;
> +
> +	if (buf->vaddr)
> +		invalidate_kernel_vmap_range(buf->vaddr, buf->size);

Am I correct that this is mostly to prevent the kernel from reading
back old data from the cache after an application or other driver did
CPU writes ? If so, can't we restrict that to DMA_TO_DEVICE and
DMA_BIDIRECTIONAL ?

As for pending kernel writes, they should have been flushed before the
buffer is made available for dequeue. And any access while a buffer is
queued is concurrent access, which is expected to have undefined
behaviour.

> +
> +	dma_sync_sgtable_for_cpu(buf->dev, sgt, buf->dma_dir);

Isn't there a link to make between buf->dma_dir and direcction before
calling this ? Also, shouldn't we use direction insead of buf->dma_dir
to possibly limit the scope ?

> +
>  	return 0;
>  }
>  
> @@ -434,6 +445,17 @@ static int
>  vb2_dc_dmabuf_ops_end_cpu_access(struct dma_buf *dbuf,
>  				 enum dma_data_direction direction)
>  {
> +	struct vb2_dc_buf *buf = dbuf->priv;
> +	struct sg_table *sgt = buf->dma_sgt;
> +
> +	if (!buf->non_coherent_mem)
> +		return 0;
> +
> +	if (buf->vaddr)
> +		flush_kernel_vmap_range(buf->vaddr, buf->size);
> +
> +	dma_sync_sgtable_for_device(buf->dev, sgt, buf->dma_dir);

Similar questions for the end_cpu_access implementation.

Nicolas

> +
>  	return 0;
>  }
>  
> 
Re: [PATCH v3 1/2] media: videobuf2: Fix dmabuf cache sync/flush in dma-contig
Posted by Tomasz Figa 11 months, 2 weeks ago
On Fri, Feb 28, 2025 at 12:30 PM Nicolas Dufresne <nicolas@ndufresne.ca> wrote:
>
> Le mardi 28 janvier 2025 à 23:35 +0300, Mikhail Rudenko a écrit :
> > When support for V4L2_FLAG_MEMORY_NON_CONSISTENT was removed in
> > commit 129134e5415d ("media: media/v4l2: remove
> > V4L2_FLAG_MEMORY_NON_CONSISTENT flag"),
> > vb2_dc_dmabuf_ops_{begin,end}_cpu_access() functions were made
> > no-ops. Later, when support for V4L2_MEMORY_FLAG_NON_COHERENT was
> > introduced in commit c0acf9cfeee0 ("media: videobuf2: handle
> > V4L2_MEMORY_FLAG_NON_COHERENT flag"), the above functions remained
> > no-ops, making cache maintenance for non-coherent dmabufs allocated
> > by
> > dma-contig impossible.
> >
> > Fix this by reintroducing dma_sync_sgtable_for_{cpu,device} and
> > {flush,invalidate}_kernel_vmap_range calls to
> > vb2_dc_dmabuf_ops_{begin,end}_cpu_access() functions for non-coherent
> > buffers.
> >
> > Fixes: c0acf9cfeee0 ("media: videobuf2: handle
> > V4L2_MEMORY_FLAG_NON_COHERENT flag")
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Mikhail Rudenko <mike.rudenko@gmail.com>
> > ---
> >  .../media/common/videobuf2/videobuf2-dma-contig.c  | 22
> > ++++++++++++++++++++++
> >  1 file changed, 22 insertions(+)
> >
> > diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> > b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> > index
> > bb0b7fa67b539aa73ad5ccf3c3bc318e26f8a4cb..146d7997a0da5989fb081a6f28c
> > e0641fe726e63 100644
> > --- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> > +++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> > @@ -427,6 +427,17 @@ static int
> >  vb2_dc_dmabuf_ops_begin_cpu_access(struct dma_buf *dbuf,
> >                                  enum dma_data_direction
> > direction)
> >  {
> > +     struct vb2_dc_buf *buf = dbuf->priv;
> > +     struct sg_table *sgt = buf->dma_sgt;
> > +
> > +     if (!buf->non_coherent_mem)
> > +             return 0;
> > +
> > +     if (buf->vaddr)
> > +             invalidate_kernel_vmap_range(buf->vaddr, buf->size);
>
> Am I correct that this is mostly to prevent the kernel from reading
> back old data from the cache after an application or other driver did
> CPU writes ? If so, can't we restrict that to DMA_TO_DEVICE and
> DMA_BIDIRECTIONAL ?

Note that this function must also synchronize between the user-space
and kernel mappings, where the DMA direction doesn't really matter.
Also it's unlikely for it to be called when not needed - why would one
begin a CPU access before the DMA, when the DMA is FROM_DEVICE?

>
> As for pending kernel writes, they should have been flushed before the
> buffer is made available for dequeue.

There is no implicit flushing for imported DMA-bufs. All the flushing
needs to be executed directly by the CPU accessors by surrounding the
access with begin and end CPU access, be it in the kernel or
userspace.

> And any access while a buffer is
> queued is concurrent access, which is expected to have undefined
> behaviour.
>

Correct.

> > +
> > +     dma_sync_sgtable_for_cpu(buf->dev, sgt, buf->dma_dir);
>
> Isn't there a link to make between buf->dma_dir and direcction before
> calling this ? Also, shouldn't we use direction insead of buf->dma_dir
> to possibly limit the scope ?

Oh, yes, that's a good catch. It should be |direction| passed here and
not |buf->dma_dir|, since the former determines what CPU access will
be done.

>
> > +
> >       return 0;
> >  }
> >
> > @@ -434,6 +445,17 @@ static int
> >  vb2_dc_dmabuf_ops_end_cpu_access(struct dma_buf *dbuf,
> >                                enum dma_data_direction direction)
> >  {
> > +     struct vb2_dc_buf *buf = dbuf->priv;
> > +     struct sg_table *sgt = buf->dma_sgt;
> > +
> > +     if (!buf->non_coherent_mem)
> > +             return 0;
> > +
> > +     if (buf->vaddr)
> > +             flush_kernel_vmap_range(buf->vaddr, buf->size);
> > +
> > +     dma_sync_sgtable_for_device(buf->dev, sgt, buf->dma_dir);
>
> Similar questions for the end_cpu_access implementation.

Yeah, same here.

>
> Nicolas
>
> > +
> >       return 0;
> >  }
> >
> >
>
Re: [PATCH v3 1/2] media: videobuf2: Fix dmabuf cache sync/flush in dma-contig
Posted by Mikhail Rudenko 11 months, 2 weeks ago
On 2025-02-28 at 19:25 +09, Tomasz Figa <tfiga@chromium.org> wrote:

> On Fri, Feb 28, 2025 at 12:30 PM Nicolas Dufresne <nicolas@ndufresne.ca> wrote:
>>
>> Le mardi 28 janvier 2025 à 23:35 +0300, Mikhail Rudenko a écrit :
>> > When support for V4L2_FLAG_MEMORY_NON_CONSISTENT was removed in
>> > commit 129134e5415d ("media: media/v4l2: remove
>> > V4L2_FLAG_MEMORY_NON_CONSISTENT flag"),
>> > vb2_dc_dmabuf_ops_{begin,end}_cpu_access() functions were made
>> > no-ops. Later, when support for V4L2_MEMORY_FLAG_NON_COHERENT was
>> > introduced in commit c0acf9cfeee0 ("media: videobuf2: handle
>> > V4L2_MEMORY_FLAG_NON_COHERENT flag"), the above functions remained
>> > no-ops, making cache maintenance for non-coherent dmabufs allocated
>> > by
>> > dma-contig impossible.
>> >
>> > Fix this by reintroducing dma_sync_sgtable_for_{cpu,device} and
>> > {flush,invalidate}_kernel_vmap_range calls to
>> > vb2_dc_dmabuf_ops_{begin,end}_cpu_access() functions for non-coherent
>> > buffers.
>> >
>> > Fixes: c0acf9cfeee0 ("media: videobuf2: handle
>> > V4L2_MEMORY_FLAG_NON_COHERENT flag")
>> > Cc: stable@vger.kernel.org
>> > Signed-off-by: Mikhail Rudenko <mike.rudenko@gmail.com>
>> > ---
>> >  .../media/common/videobuf2/videobuf2-dma-contig.c  | 22
>> > ++++++++++++++++++++++
>> >  1 file changed, 22 insertions(+)
>> >
>> > diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
>> > b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
>> > index
>> > bb0b7fa67b539aa73ad5ccf3c3bc318e26f8a4cb..146d7997a0da5989fb081a6f28c
>> > e0641fe726e63 100644
>> > --- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
>> > +++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
>> > @@ -427,6 +427,17 @@ static int
>> >  vb2_dc_dmabuf_ops_begin_cpu_access(struct dma_buf *dbuf,
>> >                                  enum dma_data_direction
>> > direction)
>> >  {
>> > +     struct vb2_dc_buf *buf = dbuf->priv;
>> > +     struct sg_table *sgt = buf->dma_sgt;
>> > +
>> > +     if (!buf->non_coherent_mem)
>> > +             return 0;
>> > +
>> > +     if (buf->vaddr)
>> > +             invalidate_kernel_vmap_range(buf->vaddr, buf->size);
>>
>> Am I correct that this is mostly to prevent the kernel from reading
>> back old data from the cache after an application or other driver did
>> CPU writes ? If so, can't we restrict that to DMA_TO_DEVICE and
>> DMA_BIDIRECTIONAL ?
>
> Note that this function must also synchronize between the user-space
> and kernel mappings, where the DMA direction doesn't really matter.
> Also it's unlikely for it to be called when not needed - why would one
> begin a CPU access before the DMA, when the DMA is FROM_DEVICE?
>
>>
>> As for pending kernel writes, they should have been flushed before the
>> buffer is made available for dequeue.
>
> There is no implicit flushing for imported DMA-bufs. All the flushing
> needs to be executed directly by the CPU accessors by surrounding the
> access with begin and end CPU access, be it in the kernel or
> userspace.
>
>> And any access while a buffer is
>> queued is concurrent access, which is expected to have undefined
>> behaviour.
>>
>
> Correct.
>
>> > +
>> > +     dma_sync_sgtable_for_cpu(buf->dev, sgt, buf->dma_dir);
>>
>> Isn't there a link to make between buf->dma_dir and direcction before
>> calling this ? Also, shouldn't we use direction insead of buf->dma_dir
>> to possibly limit the scope ?
>
> Oh, yes, that's a good catch.

Indeed, will fix this in v4.

> It should be |direction| passed here and
> not |buf->dma_dir|, since the former determines what CPU access will
> be done.
>
>>
>> > +
>> >       return 0;
>> >  }
>> >
>> > @@ -434,6 +445,17 @@ static int
>> >  vb2_dc_dmabuf_ops_end_cpu_access(struct dma_buf *dbuf,
>> >                                enum dma_data_direction direction)
>> >  {
>> > +     struct vb2_dc_buf *buf = dbuf->priv;
>> > +     struct sg_table *sgt = buf->dma_sgt;
>> > +
>> > +     if (!buf->non_coherent_mem)
>> > +             return 0;
>> > +
>> > +     if (buf->vaddr)
>> > +             flush_kernel_vmap_range(buf->vaddr, buf->size);
>> > +
>> > +     dma_sync_sgtable_for_device(buf->dev, sgt, buf->dma_dir);
>>
>> Similar questions for the end_cpu_access implementation.
>
> Yeah, same here.

Noted, will fix.

>>
>> Nicolas
>>
>> > +
>> >       return 0;
>> >  }
>> >
>> >
>>


--
Best regards,
Mikhail Rudenko
Re: [PATCH v3 1/2] media: videobuf2: Fix dmabuf cache sync/flush in dma-contig
Posted by Tomasz Figa 1 year ago
On Tue, Jan 28, 2025 at 9:36 PM Mikhail Rudenko <mike.rudenko@gmail.com> wrote:
>
> When support for V4L2_FLAG_MEMORY_NON_CONSISTENT was removed in
> commit 129134e5415d ("media: media/v4l2: remove
> V4L2_FLAG_MEMORY_NON_CONSISTENT flag"),
> vb2_dc_dmabuf_ops_{begin,end}_cpu_access() functions were made
> no-ops. Later, when support for V4L2_MEMORY_FLAG_NON_COHERENT was
> introduced in commit c0acf9cfeee0 ("media: videobuf2: handle
> V4L2_MEMORY_FLAG_NON_COHERENT flag"), the above functions remained
> no-ops, making cache maintenance for non-coherent dmabufs allocated by
> dma-contig impossible.
>
> Fix this by reintroducing dma_sync_sgtable_for_{cpu,device} and
> {flush,invalidate}_kernel_vmap_range calls to
> vb2_dc_dmabuf_ops_{begin,end}_cpu_access() functions for non-coherent
> buffers.
>
> Fixes: c0acf9cfeee0 ("media: videobuf2: handle V4L2_MEMORY_FLAG_NON_COHERENT flag")
> Cc: stable@vger.kernel.org
> Signed-off-by: Mikhail Rudenko <mike.rudenko@gmail.com>
> ---
>  .../media/common/videobuf2/videobuf2-dma-contig.c  | 22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)
>
> diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> index bb0b7fa67b539aa73ad5ccf3c3bc318e26f8a4cb..146d7997a0da5989fb081a6f28ce0641fe726e63 100644
> --- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> +++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> @@ -427,6 +427,17 @@ static int
>  vb2_dc_dmabuf_ops_begin_cpu_access(struct dma_buf *dbuf,
>                                    enum dma_data_direction direction)
>  {
> +       struct vb2_dc_buf *buf = dbuf->priv;
> +       struct sg_table *sgt = buf->dma_sgt;
> +
> +       if (!buf->non_coherent_mem)
> +               return 0;
> +
> +       if (buf->vaddr)
> +               invalidate_kernel_vmap_range(buf->vaddr, buf->size);
> +
> +       dma_sync_sgtable_for_cpu(buf->dev, sgt, buf->dma_dir);
> +
>         return 0;
>  }
>
> @@ -434,6 +445,17 @@ static int
>  vb2_dc_dmabuf_ops_end_cpu_access(struct dma_buf *dbuf,
>                                  enum dma_data_direction direction)
>  {
> +       struct vb2_dc_buf *buf = dbuf->priv;
> +       struct sg_table *sgt = buf->dma_sgt;
> +
> +       if (!buf->non_coherent_mem)
> +               return 0;
> +
> +       if (buf->vaddr)
> +               flush_kernel_vmap_range(buf->vaddr, buf->size);
> +
> +       dma_sync_sgtable_for_device(buf->dev, sgt, buf->dma_dir);
> +
>         return 0;
>  }

I took some time (over)thinking the kernel vmap range synchronization,
because these functions can be called both from the kernel space using
respective dma_buf_*() kAPI and also from the user space using the
DMA_BUF_SYNC IOCTLs, so we could in theory have the multiple
invocations racing with each other, but then I realized that we don't
really provide any guarantees for concurrent writes and reads from the
CPU, so I believe this should work fine. Sorry for the delay.

Acked-by: Tomasz Figa <tfiga@chromium.org>

Let me add @Christoph Hellwig and @Robin Murphy just in case I'm wrong
on that, though... Hans, let's give them some time to take a look
before applying this.

Best regards,
Tomasz