From: Leon Romanovsky <leonro@nvidia.com>

Rename the DMA_ATTR_CPU_CACHE_CLEAN attribute to reflect that it allows
CPU cache overlaps to exist, and document a slightly different but still
valid use case involving overlapping CPU cache lines.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
Documentation/core-api/dma-attributes.rst | 26 ++++++++++++++++++--------
drivers/virtio/virtio_ring.c | 4 ++--
include/linux/dma-mapping.h | 8 ++++----
kernel/dma/debug.c | 2 +-
4 files changed, 25 insertions(+), 15 deletions(-)
diff --git a/Documentation/core-api/dma-attributes.rst b/Documentation/core-api/dma-attributes.rst
index 1d7bfad73b1c7..6b73d92c62721 100644
--- a/Documentation/core-api/dma-attributes.rst
+++ b/Documentation/core-api/dma-attributes.rst
@@ -149,11 +149,21 @@ For architectures that require cache flushing for DMA coherence
DMA_ATTR_MMIO will not perform any cache flushing. The address
provided must never be mapped cacheable into the CPU.
-DMA_ATTR_CPU_CACHE_CLEAN
-------------------------
-
-This attribute indicates the CPU will not dirty any cacheline overlapping this
-DMA_FROM_DEVICE/DMA_BIDIRECTIONAL buffer while it is mapped. This allows
-multiple small buffers to safely share a cacheline without risk of data
-corruption, suppressing DMA debug warnings about overlapping mappings.
-All mappings sharing a cacheline should have this attribute.
+DMA_ATTR_CPU_CACHE_OVERLAP
+--------------------------
+
+This attribute indicates that CPU cache lines may overlap for buffers mapped
+with DMA_FROM_DEVICE or DMA_BIDIRECTIONAL.
+
+Such overlap may occur when callers map multiple small buffers that reside
+within the same cache line. In this case, callers must guarantee that the CPU
+will not dirty these cache lines after the mappings are established. When this
+condition is met, multiple buffers can safely share a cache line without risking
+data corruption.
+
+Another valid use case is on systems that are CPU-coherent and do not use
+SWIOTLB, where the caller can guarantee that no cache maintenance operations
+(such as flushes) will be performed that could overwrite shared cache lines.
+
+All mappings that share a cache line must set this attribute to suppress DMA
+debug warnings about overlapping mappings.
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 335692d41617a..bf51ae9a39169 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -2912,7 +2912,7 @@ EXPORT_SYMBOL_GPL(virtqueue_add_inbuf);
* @data: the token identifying the buffer.
* @gfp: how to do memory allocations (if necessary).
*
- * Same as virtqueue_add_inbuf but passes DMA_ATTR_CPU_CACHE_CLEAN to indicate
+ * Same as virtqueue_add_inbuf but passes DMA_ATTR_CPU_CACHE_OVERLAP to indicate
* that the CPU will not dirty any cacheline overlapping this buffer while it
* is available, and to suppress overlapping cacheline warnings in DMA debug
* builds.
@@ -2928,7 +2928,7 @@ int virtqueue_add_inbuf_cache_clean(struct virtqueue *vq,
gfp_t gfp)
{
return virtqueue_add(vq, &sg, num, 0, 1, data, NULL, false, gfp,
- DMA_ATTR_CPU_CACHE_CLEAN);
+ DMA_ATTR_CPU_CACHE_OVERLAP);
}
EXPORT_SYMBOL_GPL(virtqueue_add_inbuf_cache_clean);
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 29973baa05816..45efede1a6cce 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -80,11 +80,11 @@
#define DMA_ATTR_MMIO (1UL << 10)
/*
- * DMA_ATTR_CPU_CACHE_CLEAN: Indicates the CPU will not dirty any cacheline
- * overlapping this buffer while it is mapped for DMA. All mappings sharing
- * a cacheline must have this attribute for this to be considered safe.
+ * DMA_ATTR_CPU_CACHE_OVERLAP: Indicates the CPU cache line can be overlapped.
+ * All mappings sharing a cacheline must have this attribute for this
+ * to be considered safe.
*/
-#define DMA_ATTR_CPU_CACHE_CLEAN (1UL << 11)
+#define DMA_ATTR_CPU_CACHE_OVERLAP (1UL << 11)
/*
* A dma_addr_t can hold any valid DMA or bus address for the platform. It can
diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c
index be207be749968..603be342063f1 100644
--- a/kernel/dma/debug.c
+++ b/kernel/dma/debug.c
@@ -601,7 +601,7 @@ static void add_dma_entry(struct dma_debug_entry *entry, unsigned long attrs)
unsigned long flags;
int rc;
- entry->is_cache_clean = !!(attrs & DMA_ATTR_CPU_CACHE_CLEAN);
+ entry->is_cache_clean = attrs & DMA_ATTR_CPU_CACHE_OVERLAP;
bucket = get_hash_bucket(entry, &flags);
hash_bucket_add(bucket, entry);
--
2.53.0

On Sat, Mar 07, 2026 at 06:49:56PM +0200, Leon Romanovsky wrote:

> -This attribute indicates the CPU will not dirty any cacheline overlapping this
> -DMA_FROM_DEVICE/DMA_BIDIRECTIONAL buffer while it is mapped. This allows
> -multiple small buffers to safely share a cacheline without risk of data
> -corruption, suppressing DMA debug warnings about overlapping mappings.
> -All mappings sharing a cacheline should have this attribute.
> +DMA_ATTR_CPU_CACHE_OVERLAP

This is a very specific and well defined use case that allows some cache
flushing behaviors to work only under the promise that the CPU doesn't
touch the memory to cause cache inconsistencies.

> +Another valid use case is on systems that are CPU-coherent and do not use
> +SWIOTLB, where the caller can guarantee that no cache maintenance operations
> +(such as flushes) will be performed that could overwrite shared cache lines.

This is something completely unrelated.

What I would really like is a new DMA_ATTR_REQUIRE_COHERENT which
fails any mapping requests that would use any SWIOTLB or cache
flushing.

It should only be used by callers like RDMA/DRM/etc where they have
historical uAPI that has never supported incoherent DMA operation and
are an exception to the normal DMA API requirements.

The problem is to limit the use of that flag to only a few approved
places. I fear adding such a flag wide open would open the door to
widespread driver abuse. These days we have 'export symbol for module'
so maybe there is a way to do it with safety?

I'd really like this right now because CC systems are forcing SWIOTLB
and things like RDMA userspace are unfixably broken with SWIOTLB. The
uAPI it has simply cannot work with it. I'd much rather fail
immediately than suffer data corruption.

Jiri was looking at adding some hacky "is cc" check, but I'd far
prefer a proper flag that covered all the uAPI breaking cases.

Jason

On Sun, Mar 08, 2026 at 03:19:20PM -0300, Jason Gunthorpe wrote:
> On Sat, Mar 07, 2026 at 06:49:56PM +0200, Leon Romanovsky wrote:
>
> > -This attribute indicates the CPU will not dirty any cacheline overlapping this
> > -DMA_FROM_DEVICE/DMA_BIDIRECTIONAL buffer while it is mapped. This allows
> > -multiple small buffers to safely share a cacheline without risk of data
> > -corruption, suppressing DMA debug warnings about overlapping mappings.
> > -All mappings sharing a cacheline should have this attribute.
> > +DMA_ATTR_CPU_CACHE_OVERLAP
>
> This is a very specific and well defined use case that allows some cache
> flushing behaviors to work only under the promise that the CPU doesn't
> touch the memory to cause cache inconsistencies.
>
> > +Another valid use case is on systems that are CPU-coherent and do not use
> > +SWIOTLB, where the caller can guarantee that no cache maintenance operations
> > +(such as flushes) will be performed that could overwrite shared cache lines.
>
> This is something completely unrelated.

I disagree. The situation is equivalent in that callers guarantee the
CPU cache will not be overwritten.

For the RDMA case, this results in the same behavior as with virtio.
For our case, it addresses and clears the debug warnings.

>
> What I would really like is a new DMA_ATTR_REQUIRE_COHERENT which
> fails any mappings requests that would use any SWIOTLB or cache
> flushing.

You are proposing something orthogonal that operates at a different layer
(DMA mapping). However, for DMA debugging, your new attribute will be
equivalent to DMA_ATTR_CPU_CACHE_OVERLAP.

Thanks

On Sun, Mar 08, 2026 at 08:49:02PM +0200, Leon Romanovsky wrote:
> On Sun, Mar 08, 2026 at 03:19:20PM -0300, Jason Gunthorpe wrote:
> > On Sat, Mar 07, 2026 at 06:49:56PM +0200, Leon Romanovsky wrote:
> >
> > > -This attribute indicates the CPU will not dirty any cacheline overlapping this
> > > -DMA_FROM_DEVICE/DMA_BIDIRECTIONAL buffer while it is mapped. This allows
> > > -multiple small buffers to safely share a cacheline without risk of data
> > > -corruption, suppressing DMA debug warnings about overlapping mappings.
> > > -All mappings sharing a cacheline should have this attribute.
> > > +DMA_ATTR_CPU_CACHE_OVERLAP
> >
> > This is a very specific and well defined use case that allows some cache
> > flushing behaviors to work only under the promise that the CPU doesn't
> > touch the memory to cause cache inconsistencies.
> >
> > > +Another valid use case is on systems that are CPU-coherent and do not use
> > > +SWIOTLB, where the caller can guarantee that no cache maintenance operations
> > > +(such as flushes) will be performed that could overwrite shared cache lines.
> >
> > This is something completely unrelated.
>
> I disagree. The situation is equivalent in that callers guarantee the
> CPU cache will not be overwritten.

The RDMA callers do no such thing, they just don't work at all if
there is non-coherence in the mapping which is why it is not a bug.

virtio looks like it does actually keep the caches clean for different
mappings (and probably also in practice forced coherent as well given
qemu is coherent with the VM and VFIO doesn't allow non-coherent DMA
devices)

> > What I would really like is a new DMA_ATTR_REQUIRE_COHERENT which
> > fails any mappings requests that would use any SWIOTLB or cache
> > flushing.
>
> You are proposing something orthogonal that operates at a different layer
> (DMA mapping). However, for DMA debugging, your new attribute will be
> equivalent to DMA_ATTR_CPU_CACHE_OVERLAP.

DMA_ATTR is a dma mapping flag, if you want some weird dma debugging
flag it should be called DMA_ATTR_DEBUGGING_IGNORE_CACHELINES with
some kind of statement at the user why it is OK.

Jason

On Sun, Mar 08, 2026 at 08:09:16PM -0300, Jason Gunthorpe wrote:
> On Sun, Mar 08, 2026 at 08:49:02PM +0200, Leon Romanovsky wrote:
> > On Sun, Mar 08, 2026 at 03:19:20PM -0300, Jason Gunthorpe wrote:
> > > On Sat, Mar 07, 2026 at 06:49:56PM +0200, Leon Romanovsky wrote:
> > >
> > > > -This attribute indicates the CPU will not dirty any cacheline overlapping this
> > > > -DMA_FROM_DEVICE/DMA_BIDIRECTIONAL buffer while it is mapped. This allows
> > > > -multiple small buffers to safely share a cacheline without risk of data
> > > > -corruption, suppressing DMA debug warnings about overlapping mappings.
> > > > -All mappings sharing a cacheline should have this attribute.
> > > > +DMA_ATTR_CPU_CACHE_OVERLAP
> > >
> > > This is a very specific and well defined use case that allows some cache
> > > flushing behaviors to work only under the promise that the CPU doesn't
> > > touch the memory to cause cache inconsistencies.
> > >
> > > > +Another valid use case is on systems that are CPU-coherent and do not use
> > > > +SWIOTLB, where the caller can guarantee that no cache maintenance operations
> > > > +(such as flushes) will be performed that could overwrite shared cache lines.
> > >
> > > This is something completely unrelated.
> >
> > I disagree. The situation is equivalent in that callers guarantee the
> > CPU cache will not be overwritten.
>
> The RDMA callers do no such thing, they just don't work at all if
> there is non-coherence in the mapping which is why it is not a bug.
>
> virtio looks like it does actually keep the caches clean for different
> mappings (and probably also in practice forced coherent as well given
> qemu is coherent with the VM and VFIO doesn't allow non-coherent DMA
> devices)
>
> > > What I would really like is a new DMA_ATTR_REQUIRE_COHERENT which
> > > fails any mappings requests that would use any SWIOTLB or cache
> > > flushing.
> >
> > You are proposing something orthogonal that operates at a different layer
> > (DMA mapping). However, for DMA debugging, your new attribute will be
> > equivalent to DMA_ATTR_CPU_CACHE_OVERLAP.
>
> DMA_ATTR is a dma mapping flag, if you want some weird dma debugging
> flag it should be called DMA_ATTR_DEBUGGING_IGNORE_CACHELINES with
> some kind of statement at the user why it is OK.

And this is the issue: the existing DMA_ATTR_CPU_CACHE_CLEAN is essentially
a debug-oriented attribute. The upper layers are already handled through
__dma_from_device_group_begin()/end(), which pad cache lines on
non-coherent systems.

Marek,

What do you see as the right path forward here? RDMA has a legitimate use
case where CPU cache lines may overlap. The underlying reason differs from
VirtIO, but the outcome is the same. Should I keep the current name? Should
we rename it to the proposed DMA_ATTR_CPU_CACHE_OVERLAP or
DMA_ATTR_DEBUGGING_IGNORE_CACHELINES? Should we introduce a new
DMA_ATTR_REQUIRE_COHERENT attribute instead? Or do you have another
recommendation?

Thanks

>
> Jason

On 09.03.2026 10:03, Leon Romanovsky wrote:
> On Sun, Mar 08, 2026 at 08:09:16PM -0300, Jason Gunthorpe wrote:
>> On Sun, Mar 08, 2026 at 08:49:02PM +0200, Leon Romanovsky wrote:
>>> On Sun, Mar 08, 2026 at 03:19:20PM -0300, Jason Gunthorpe wrote:
>>>> On Sat, Mar 07, 2026 at 06:49:56PM +0200, Leon Romanovsky wrote:
>>>>
>>>>> -This attribute indicates the CPU will not dirty any cacheline overlapping this
>>>>> -DMA_FROM_DEVICE/DMA_BIDIRECTIONAL buffer while it is mapped. This allows
>>>>> -multiple small buffers to safely share a cacheline without risk of data
>>>>> -corruption, suppressing DMA debug warnings about overlapping mappings.
>>>>> -All mappings sharing a cacheline should have this attribute.
>>>>> +DMA_ATTR_CPU_CACHE_OVERLAP
>>>> This is a very specific and well defined use case that allows some cache
>>>> flushing behaviors to work only under the promise that the CPU doesn't
>>>> touch the memory to cause cache inconsistencies.
>>>>
>>>>> +Another valid use case is on systems that are CPU-coherent and do not use
>>>>> +SWIOTLB, where the caller can guarantee that no cache maintenance operations
>>>>> +(such as flushes) will be performed that could overwrite shared cache lines.
>>>> This is something completely unrelated.
>>> I disagree. The situation is equivalent in that callers guarantee the
>>> CPU cache will not be overwritten.
>> The RDMA callers do no such thing, they just don't work at all if
>> there is non-coherence in the mapping which is why it is not a bug.
>>
>> virtio looks like it does actually keep the caches clean for different
>> mappings (and probably also in practice forced coherent as well given
>> qemu is coherent with the VM and VFIO doesn't allow non-coherent DMA
>> devices)
>>
>>>> What I would really like is a new DMA_ATTR_REQUIRE_COHERENT which
>>>> fails any mappings requests that would use any SWIOTLB or cache
>>>> flushing.
>>> You are proposing something orthogonal that operates at a different layer
>>> (DMA mapping). However, for DMA debugging, your new attribute will be
>>> equivalent to DMA_ATTR_CPU_CACHE_OVERLAP.
>> DMA_ATTR is a dma mapping flag, if you want some weird dma debugging
>> flag it should be called DMA_ATTR_DEBUGGING_IGNORE_CACHELINES with
>> some kind of statement at the user why it is OK.
> And this is the issue: the existing DMA_ATTR_CPU_CACHE_CLEAN is essentially
> a debug-oriented attribute. The upper layers are already handled through
> __dma_from_device_group_begin()/end(), which pad cache lines on
> non-coherent systems.
>
> Marek,
>
> What do you see as the right path forward here? RDMA has a legitimate use
> case where CPU cache lines may overlap. The underlying reason differs from
> VirtIO, but the outcome is the same. Should I keep the current name? Should
> we rename it to the proposed DMA_ATTR_CPU_CACHE_OVERLAP or
> DMA_ATTR_DEBUGGING_IGNORE_CACHELINES? Should we introduce a new
> DMA_ATTR_REQUIRE_COHERENT attribute instead? Or do you have another
> recommendation?

My question here is if RDMA works on any non-coherent DMA systems? If
not then it should fail early (during init or probe?) to avoid potential
data corruption and new DMA attributes won't help it.

On the other hand, the DMA_ATTR_CPU_CACHE_OVERLAP attribute is a bit more
descriptive to me than DMA_ATTR_CPU_CACHE_CLEAN, but this indeed looks
like a separate issue from the RDMA case.

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland

On Mon, Mar 09, 2026 at 01:30:24PM +0100, Marek Szyprowski wrote:
> On 09.03.2026 10:03, Leon Romanovsky wrote:
> > On Sun, Mar 08, 2026 at 08:09:16PM -0300, Jason Gunthorpe wrote:
> >> On Sun, Mar 08, 2026 at 08:49:02PM +0200, Leon Romanovsky wrote:
> >>> On Sun, Mar 08, 2026 at 03:19:20PM -0300, Jason Gunthorpe wrote:
> >>>> On Sat, Mar 07, 2026 at 06:49:56PM +0200, Leon Romanovsky wrote:
> >>>>
> >>>>> -This attribute indicates the CPU will not dirty any cacheline overlapping this
> >>>>> -DMA_FROM_DEVICE/DMA_BIDIRECTIONAL buffer while it is mapped. This allows
> >>>>> -multiple small buffers to safely share a cacheline without risk of data
> >>>>> -corruption, suppressing DMA debug warnings about overlapping mappings.
> >>>>> -All mappings sharing a cacheline should have this attribute.
> >>>>> +DMA_ATTR_CPU_CACHE_OVERLAP
> >>>> This is a very specific and well defined use case that allows some cache
> >>>> flushing behaviors to work only under the promise that the CPU doesn't
> >>>> touch the memory to cause cache inconsistencies.
> >>>>
> >>>>> +Another valid use case is on systems that are CPU-coherent and do not use
> >>>>> +SWIOTLB, where the caller can guarantee that no cache maintenance operations
> >>>>> +(such as flushes) will be performed that could overwrite shared cache lines.
> >>>> This is something completely unrelated.
> >>> I disagree. The situation is equivalent in that callers guarantee the
> >>> CPU cache will not be overwritten.
> >> The RDMA callers do no such thing, they just don't work at all if
> >> there is non-coherence in the mapping which is why it is not a bug.
> >>
> >> virtio looks like it does actually keep the caches clean for different
> >> mappings (and probably also in practice forced coherent as well given
> >> qemu is coherent with the VM and VFIO doesn't allow non-coherent DMA
> >> devices)
> >>
> >>>> What I would really like is a new DMA_ATTR_REQUIRE_COHERENT which
> >>>> fails any mappings requests that would use any SWIOTLB or cache
> >>>> flushing.
> >>> You are proposing something orthogonal that operates at a different layer
> >>> (DMA mapping). However, for DMA debugging, your new attribute will be
> >>> equivalent to DMA_ATTR_CPU_CACHE_OVERLAP.
> >> DMA_ATTR is a dma mapping flag, if you want some weird dma debugging
> >> flag it should be called DMA_ATTR_DEBUGGING_IGNORE_CACHELINES with
> >> some kind of statement at the user why it is OK.
> > And this is the issue: the existing DMA_ATTR_CPU_CACHE_CLEAN is essentially
> > a debug-oriented attribute. The upper layers are already handled through
> > __dma_from_device_group_begin()/end(), which pad cache lines on
> > non-coherent systems.
> >
> > Marek,
> >
> > What do you see as the right path forward here? RDMA has a legitimate use
> > case where CPU cache lines may overlap. The underlying reason differs from
> > VirtIO, but the outcome is the same. Should I keep the current name? Should
> > we rename it to the proposed DMA_ATTR_CPU_CACHE_OVERLAP or
> > DMA_ATTR_DEBUGGING_IGNORE_CACHELINES? Should we introduce a new
> > DMA_ATTR_REQUIRE_COHERENT attribute instead? Or do you have another
> > recommendation?
>
> My question here is if RDMA works on any non-coherent DMA systems? If
> not then it should fail early (during init or probe?) to avoid potential
> data corruption and new DMA attributes won't help it.

Like Jason wrote, our user-visible API does not work on non-coherent
systems, and this is where I'm using the DMA_ATTR_CPU_CACHE_OVERLAP
attribute.

Regarding failure on unsupported systems, I have tried more than once to
make the RDMA fail when the device is known to take the SWIOTLB path
in RDMA and cannot operate correctly, but each attempt was met with a
cold reception:
https://lore.kernel.org/all/d18c454636bf3cfdba9b66b7cc794d713eadc4a5.1719909395.git.leon@kernel.org/

I'm afraid the outcome will be the same this time as well.

> On the other hand, the DMA_ATTR_CPU_CACHE_OVERLAP attribute is a bit more
> descriptive to me than DMA_ATTR_CPU_CACHE_CLEAN, but this indeed looks
> like a separate issue from the RDMA case.
>
> Best regards
> --
> Marek Szyprowski, PhD
> Samsung R&D Institute Poland
>
>

On Mon, Mar 09, 2026 at 05:05:02PM +0200, Leon Romanovsky wrote:

> Regarding failure on unsupported systems, I have tried more than once to
> make the RDMA fail when the device is known to take the SWIOTLB path
> in RDMA and cannot operate correctly, but each attempt was met with a
> cold reception:
> https://lore.kernel.org/all/d18c454636bf3cfdba9b66b7cc794d713eadc4a5.1719909395.git.leon@kernel.org/

I think a lot of that is the APIs used there. It is hard to determine
if SWIOTLB is possible or coherent is possible, I've also hit these
things in VFIO and gave up.

However, DMA_ATTR_REQUIRE_COHERENCE can be done properly and not leak
a lot of dangerous APIs to drivers (beyond itself).

It is also more important now with CC systems, I think.

Jason

On 09.03.2026 16:13, Jason Gunthorpe wrote:
> On Mon, Mar 09, 2026 at 05:05:02PM +0200, Leon Romanovsky wrote:
>> Regarding failure on unsupported systems, I have tried more than once to
>> make the RDMA fail when the device is known to take the SWIOTLB path
>> in RDMA and cannot operate correctly, but each attempt was met with a
>> cold reception:
>> https://lore.kernel.org/all/d18c454636bf3cfdba9b66b7cc794d713eadc4a5.1719909395.git.leon@kernel.org/
> I think alot of that is the APIs used there. It is hard to determine
> if SWIOTLB is possible or coherent is possible, I've also hit these
> things in VFIO and gave up.
>
> However, DMA_ATTR_REQUIRE_COHERENCE can be done properly and not leak
> alot of dangerous APIs to drivers (beyond itself).
>
> It is also more important now with CC systems, I think.

Jason is right. Indeed the rdma/uverbs case needs some extension to
ensure that the coherent mapping is used, which is not possible now. This
however doesn't mean that the DMA_ATTR_CPU_CACHE_OVERLAP is not needed
for that use case too. I'm open to accept both. The only question I have
is which name should we use? We already have DMA_ATTR_CPU_CACHE_CLEAN,
while DMA_ATTR_CPU_CACHE_OVERLAP and
DMA_ATTR_DEBUGGING_IGNORE_CACHELINES were proposed here. The last seems
to be most descriptive.

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland

On Tue, Mar 10, 2026 at 10:45:38AM +0100, Marek Szyprowski wrote:

> Jason is right. Indeed the rdma/uverbs case needs some extension to
> ensure that the coherent mapping is used, what is not possible now. This
> however doesn't mean that the DMA_ATTR_CPU_CACHE_OVERLAP is not needed
> for that use case too. I'm open to accept both. The only question I have
> is which name should we use? We already have DMA_ATTR_CPU_CACHE_CLEAN,
> while DMA_ATTR_CPU_CACHE_OVERLAP and
> DMA_ATTR_DEBUGGING_IGNORE_CACHELINES were proposed here. The last seems
> to be most descriptive.

If we do DMA_ATTR_REQUIRE_COHERENCE then I imagine it would internally
also set DMA_ATTR_DEBUGGING_IGNORE_CACHELINES, but I'd prefer that
detail not leak into the callers.

Jason
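
For readers following the thread, the idea of one attribute internally implying the other can be sketched roughly as below. This is a user-space toy, not kernel code and not the unsent v2: the TOY_* flags, struct toy_device and toy_map_check() are hypothetical names invented here, and the two booleans stand in for whatever the DMA core would actually consult to decide on cache maintenance or SWIOTLB bouncing.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the attributes named in the thread. */
#define TOY_ATTR_DEBUGGING_IGNORE_CACHELINES (1ul << 0)
#define TOY_ATTR_REQUIRE_COHERENCE           (1ul << 1)

/* Hypothetical device description; real kernels derive this elsewhere. */
struct toy_device {
	bool dma_coherent;   /* no cache maintenance needed for DMA */
	bool needs_bounce;   /* mapping would go through a bounce buffer */
};

/*
 * Validate a mapping request. With REQUIRE_COHERENCE set, refuse any
 * mapping that would need cache flushing or bouncing; otherwise quietly
 * add the debug-only flag so the caller never has to pass it itself.
 */
static int toy_map_check(const struct toy_device *dev, unsigned long *attrs)
{
	if (!(*attrs & TOY_ATTR_REQUIRE_COHERENCE))
		return 0;

	if (!dev->dma_coherent || dev->needs_bounce)
		return -EOPNOTSUPP;

	*attrs |= TOY_ATTR_DEBUGGING_IGNORE_CACHELINES;
	return 0;
}
```

In this sketch a caller only ever passes TOY_ATTR_REQUIRE_COHERENCE; the debug flag is an internal consequence, which is the "not leak into the callers" property asked for above.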

On 10.03.2026 13:34, Jason Gunthorpe wrote:
> On Tue, Mar 10, 2026 at 10:45:38AM +0100, Marek Szyprowski wrote:
>> Jason is right. Indeed the rdma/uverbs case needs some extension to
>> ensure that the coherent mapping is used, what is not possible now. This
>> however doesn't mean that the DMA_ATTR_CPU_CACHE_OVERLAP is not needed
>> for that use case too. I'm open to accept both. The only question I have
>> is which name should we use? We already have DMA_ATTR_CPU_CACHE_CLEAN,
>> while DMA_ATTR_CPU_CACHE_OVERLAP and
>> DMA_ATTR_DEBUGGING_IGNORE_CACHELINES were proposed here. The last seems
>> to be most descriptive.
> If we do DMA_ATTR_REQUIRE_COHERENCE then I imagine it would internally
> also set DMA_ATTR_DEBUGGING_IGNORE_CACHELINES, but I'd prefer that
> detail not leak into the callers.

Why should DMA_ATTR_REQUIRE_COHERENCE imply
DMA_ATTR_DEBUGGING_IGNORE_CACHELINES?

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland

On Tue, Mar 10, 2026 at 10:08:38PM +0100, Marek Szyprowski wrote:
> On 10.03.2026 13:34, Jason Gunthorpe wrote:
> > On Tue, Mar 10, 2026 at 10:45:38AM +0100, Marek Szyprowski wrote:
> >> Jason is right. Indeed the rdma/uverbs case needs some extension to
> >> ensure that the coherent mapping is used, what is not possible now. This
> >> however doesn't mean that the DMA_ATTR_CPU_CACHE_OVERLAP is not needed
> >> for that use case too. I'm open to accept both. The only question I have
> >> is which name should we use? We already have DMA_ATTR_CPU_CACHE_CLEAN,
> >> while DMA_ATTR_CPU_CACHE_OVERLAP and
> >> DMA_ATTR_DEBUGGING_IGNORE_CACHELINES were proposed here. The last seems
> >> to be most descriptive.
> > If we do DMA_ATTR_REQUIRE_COHERENCE then I imagine it would internally
> > also set DMA_ATTR_DEBUGGING_IGNORE_CACHELINES, but I'd prefer that
> > detail not leak into the callers.
>
> Why DMA_ATTR_REQUIRE_COHERENCE should imply
> DMA_ATTR_DEBUGGING_IGNORE_CACHELINES?

AFAICT the purpose of the DMA API debugging cacheline tracking is to
ensure that drivers are mapping things properly such that the cache
flushing in incoherent systems can properly cache flush them without
creating bugs (i.e. a dirty line overwriting DMA'd data or something).

If the mapping is REQUIRE_COHERENCE then it is prevented from running
on systems where these cache artifacts can cause corruption, so we
don't need to track them and we don't need the strict restrictions on
what can be mapped.

Those restrictions trip up and give false positives for cases like
RDMA, DRM, etc. that are allowing userspace to multi-map userspace
memory.

Jason
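
The tracking rule discussed in this thread (warn when two mappings share a cache line unless every mapping on that line has opted out) can be modelled in a few lines of user-space C. This is only a toy under stated assumptions: it is not kernel/dma/debug.c, the 64-byte line size is assumed, and every toy_* / TOY_* name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values and hypothetical names; not the kernel's definitions. */
#define TOY_CACHELINE_SIZE         64u
#define TOY_MAX_MAPPINGS           16u
#define TOY_ATTR_IGNORE_CACHELINES (1ul << 0)

struct toy_mapping {
	uint64_t first_line;   /* index of the first cache line touched */
	uint64_t last_line;    /* index of the last cache line touched */
	bool ignore_overlap;   /* mapping opted out of overlap warnings */
};

struct toy_tracker {
	struct toy_mapping maps[TOY_MAX_MAPPINGS];
	size_t count;
};

/*
 * Record a mapping. Returns true when a warning would fire: the new
 * mapping shares a cache line with an existing one and at least one of
 * the two has not set the opt-out attribute.
 */
static bool toy_track_mapping(struct toy_tracker *t, uint64_t addr,
			      size_t len, unsigned long attrs)
{
	struct toy_mapping m;
	bool warn = false;
	size_t i;

	assert(len > 0 && t->count < TOY_MAX_MAPPINGS);

	m.first_line = addr / TOY_CACHELINE_SIZE;
	m.last_line = (addr + len - 1) / TOY_CACHELINE_SIZE;
	m.ignore_overlap = (attrs & TOY_ATTR_IGNORE_CACHELINES) != 0;

	for (i = 0; i < t->count; i++) {
		const struct toy_mapping *o = &t->maps[i];
		bool overlap = m.first_line <= o->last_line &&
			       o->first_line <= m.last_line;

		if (overlap && !(m.ignore_overlap && o->ignore_overlap))
			warn = true;
	}

	t->maps[t->count++] = m;
	return warn;
}
```

Two 16-byte buffers at offsets 0x1000 and 0x1010 land in the same 64-byte line, so in this model the second call warns unless both calls pass TOY_ATTR_IGNORE_CACHELINES. That is the false positive the multi-mapping users mentioned above would otherwise hit.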

On Tue, Mar 10, 2026 at 09:34:05AM -0300, Jason Gunthorpe wrote:
> On Tue, Mar 10, 2026 at 10:45:38AM +0100, Marek Szyprowski wrote:
> > Jason is right. Indeed the rdma/uverbs case needs some extension to
> > ensure that the coherent mapping is used, what is not possible now. This
> > however doesn't mean that the DMA_ATTR_CPU_CACHE_OVERLAP is not needed
> > for that use case too. I'm open to accept both. The only question I have
> > is which name should we use? We already have DMA_ATTR_CPU_CACHE_CLEAN,
> > while DMA_ATTR_CPU_CACHE_OVERLAP and
> > DMA_ATTR_DEBUGGING_IGNORE_CACHELINES were proposed here. The last seems
> > to be most descriptive.
>
> If we do DMA_ATTR_REQUIRE_COHERENCE then I imagine it would internally
> also set DMA_ATTR_DEBUGGING_IGNORE_CACHELINES, but I'd prefer that
> detail not leak into the callers.

Yes, this is how I implemented it in my v2, which I haven't sent yet :).

Thanks

>
> Jason

On Mon, Mar 09, 2026 at 01:30:24PM +0100, Marek Szyprowski wrote:

> My question here is if RDMA works on any non-coherent DMA systems?

The in kernel components do work, like storage, nvme over fabrics,
netdev.

The user API (uverbs) does not work at all, and has never worked.

I think DRM has similar issues too where most of their DMA API usage
is OK but some places where they interact with pin_user_pages() have
the same issues as RDMA.

This is why I'd like a new attribute DMA_ATTR_REQUIRE_COHERENCE that
these special cases can use to fail instead of corrupting data.

Jason