This series refactors the DMA mapping to use physical addresses
as the primary interface instead of page+offset parameters. This
change aligns the DMA API with the underlying hardware reality where
DMA operations work with physical addresses, not page structures.

The series consists of 8 patches that progressively convert the DMA
mapping infrastructure from page-based to physical address-based APIs:

The series maintains backward compatibility by keeping the old
page-based API as wrapper functions around the new physical
address-based implementations.

Thanks

Leon Romanovsky (8):
  dma-debug: refactor to use physical addresses for page mapping
  dma-mapping: rename trace_dma_*map_page to trace_dma_*map_phys
  iommu/dma: rename iommu_dma_*map_page to iommu_dma_*map_phys
  dma-mapping: convert dma_direct_*map_page to be phys_addr_t based
  kmsan: convert kmsan_handle_dma to use physical addresses
  dma-mapping: fail early if physical address is mapped through platform
    callback
  dma-mapping: export new dma_*map_phys() interface
  mm/hmm: migrate to physical address-based DMA mapping API

 Documentation/core-api/dma-api.rst |  4 +-
 arch/powerpc/kernel/dma-iommu.c    |  4 +-
 drivers/iommu/dma-iommu.c          | 14 +++----
 drivers/virtio/virtio_ring.c       |  4 +-
 include/linux/dma-map-ops.h        |  8 ++--
 include/linux/dma-mapping.h        | 13 ++++++
 include/linux/iommu-dma.h          |  7 ++--
 include/linux/kmsan.h              | 12 +++---
 include/trace/events/dma.h         |  4 +-
 kernel/dma/debug.c                 | 28 ++++++++-----
 kernel/dma/debug.h                 | 16 ++++---
 kernel/dma/direct.c                |  6 +--
 kernel/dma/direct.h                | 13 +++---
 kernel/dma/mapping.c               | 67 +++++++++++++++++++++---------
 kernel/dma/ops_helpers.c           |  6 +--
 mm/hmm.c                           |  8 ++--
 mm/kmsan/hooks.c                   | 36 ++++++++++++----
 tools/virtio/linux/kmsan.h         |  2 +-
 18 files changed, 159 insertions(+), 93 deletions(-)

--
2.49.0
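The backward-compatibility approach described in the cover letter can be
pictured with a minimal sketch: the existing page-based entry point simply
forwards to the new physical-address one. The dma_map_phys() signature and
the omission of debug/tracing details are assumptions here, not the actual
patch contents.

#include <linux/dma-mapping.h>

/*
 * Sketch only: the real dma_map_page_attrs() keeps additional checks,
 * tracing and dma-debug hooks; the dma_map_phys() signature is assumed
 * from the naming used in this series.
 */
dma_addr_t dma_map_page_attrs(struct device *dev, struct page *page,
		size_t offset, size_t size, enum dma_data_direction dir,
		unsigned long attrs)
{
	/* page + offset collapses into a single physical address */
	return dma_map_phys(dev, page_to_phys(page) + offset, size, dir, attrs);
}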
On 2025-06-25 2:18 pm, Leon Romanovsky wrote:
> This series refactors the DMA mapping to use physical addresses
> as the primary interface instead of page+offset parameters. This
> change aligns the DMA API with the underlying hardware reality where
> DMA operations work with physical addresses, not page structures.

That is obvious nonsense - the DMA *API* does not exist in "hardware
reality"; the DMA API abstracts *software* operations that must be
performed before and after the actual hardware DMA operation in order
to preserve memory coherency etc.

Streaming DMA API callers get their buffers from alloc_pages() or
kmalloc(); they do not have physical addresses, they have a page or
virtual address. The internal operations of pretty much every DMA API
implementation that isn't a no-op also require a page and/or virtual
address. It is 100% logical for the DMA API interfaces to take a page
or virtual address (and since virt_to_page() is pretty trivial, we
already consolidated the two interfaces ages ago). Yes, once you get
right down to the low-level arch_sync_dma_*() interfaces, those do take
a physical address, but that's mostly an artefact of them being
factored out of old dma_sync_single_*() implementations that took a
(physical) DMA address. Nearly all of them then use __va() or
phys_to_virt() to actually consume it. Even though it's a phys_addr_t,
the implicit guarantee that it represents page-backed memory is
absolutely vital.

Take a step back; what do you imagine that a DMA API call on a
non-page-backed physical address could actually *do*?

- Cache maintenance? No, it would be illogical for a P2P address to be
  cached in a CPU cache, and anyway it would almost always crash
  because it requires page-backed memory with a virtual address.

- Bounce buffering? Again no, that would be illogical, defeat the
  entire point of a P2P operation, and anyway would definitely crash
  because it requires page-backed memory with a virtual address.

- IOMMU mappings? Oh hey look, that's exactly what dma_map_resource()
  has been doing for 9 years. Not to mention your new IOMMU API if
  callers want to be IOMMU-aware (although without the same guarantee
  of not also doing the crashy things).

- Debug tracking? Again, already taken care of by dma_map_resource().

- Some entirely new concept? Well, I'm eager to be enlightened if so!

But given what we do already know of from decades of experience, obvious
question: For the tiny minority of users who know full well when they're
dealing with a non-page-backed physical address, what's wrong with using
dma_map_resource?

Does it make sense to try to consolidate our p2p infrastructure so
dma_map_resource() could return bus addresses where appropriate? Yes,
almost certainly, if it makes it more convenient to use. And with only
about 20 users it's not too impractical to add some extra arguments or
even rejig the whole interface if need be. Indeed an overhaul might even
help solve the current grey area as to when it should take dma_range_map
into account or not for platform devices.

> The series consists of 8 patches that progressively convert the DMA
> mapping infrastructure from page-based to physical address-based APIs:

And as a result it ends up making said DMA mapping infrastructure
slightly more complicated and slightly less efficient for all its
legitimate users, all so one or two highly specialised users can then
pretend to call it in situations where it must be a no-op anyway?
Please explain convincingly why that is not a giant waste of time.
Are we trying to remove struct page from the kernel altogether? If yes,
then for goodness' sake lead with that, but even then I'd still prefer
to see the replacements for critical related infrastructure like
pfn_valid() in place before we start trying to reshape the DMA API to
fit.

Thanks,
Robin.

> The series maintains backward compatibility by keeping the old
> page-based API as wrapper functions around the new physical
> address-based implementations.
>
> Thanks
>
> Leon Romanovsky (8):
>   dma-debug: refactor to use physical addresses for page mapping
>   dma-mapping: rename trace_dma_*map_page to trace_dma_*map_phys
>   iommu/dma: rename iommu_dma_*map_page to iommu_dma_*map_phys
>   dma-mapping: convert dma_direct_*map_page to be phys_addr_t based
>   kmsan: convert kmsan_handle_dma to use physical addresses
>   dma-mapping: fail early if physical address is mapped through platform
>     callback
>   dma-mapping: export new dma_*map_phys() interface
>   mm/hmm: migrate to physical address-based DMA mapping API
>
>  Documentation/core-api/dma-api.rst |  4 +-
>  arch/powerpc/kernel/dma-iommu.c    |  4 +-
>  drivers/iommu/dma-iommu.c          | 14 +++----
>  drivers/virtio/virtio_ring.c       |  4 +-
>  include/linux/dma-map-ops.h        |  8 ++--
>  include/linux/dma-mapping.h        | 13 ++++++
>  include/linux/iommu-dma.h          |  7 ++--
>  include/linux/kmsan.h              | 12 +++---
>  include/trace/events/dma.h         |  4 +-
>  kernel/dma/debug.c                 | 28 ++++++++-----
>  kernel/dma/debug.h                 | 16 ++++---
>  kernel/dma/direct.c                |  6 +--
>  kernel/dma/direct.h                | 13 +++---
>  kernel/dma/mapping.c               | 67 +++++++++++++++++++++---------
>  kernel/dma/ops_helpers.c           |  6 +--
>  mm/hmm.c                           |  8 ++--
>  mm/kmsan/hooks.c                   | 36 ++++++++++++----
>  tools/virtio/linux/kmsan.h         |  2 +-
>  18 files changed, 159 insertions(+), 93 deletions(-)
>
On Fri, Jul 25, 2025 at 09:05:46PM +0100, Robin Murphy wrote:
> But given what we do already know of from decades of experience, obvious
> question: For the tiny minority of users who know full well when they're
> dealing with a non-page-backed physical address, what's wrong with using
> dma_map_resource?

I was also pushing for this, that we would have two separate paths:

 - the phys_addr is guaranteed to have a KVA (and today also a struct page)
 - the phys_addr is non-cachable and no KVA may exist

This is basically already the distinction today between map_resource and
map_page. The caller would have to look at what it is trying to map, do
the P2P evaluation, and then call the cachable phys or resource path(s).

Leon, I think you should revive the work you had along these lines. It
would address my concerns with the dma_ops changes too. I continue to
think we should not push non-cachable, non-KVA MMIO down the map_page
ops; those should use the map_resource op.

> Does it make sense to try to consolidate our p2p infrastructure so
> dma_map_resource() could return bus addresses where appropriate?

For some users, but not entirely :(

The sg path for P2P relies on storing information inside the scatterlist
so unmap knows what to do. Changing map_resource to return a similar flag
and then having drivers somehow store that flag and give it back to unmap
is not a trivial change.

It would be a good API for simple drivers, and I think we could build
such a helper calling through the new flow. But places like DMABUF that
have more complex lists will not like it. For them we've been following
the approach of BIO, where the driver/subsystem will maintain a mapping
list and be aware of when the P2P information is changing. Then it has
to do different map/unmap sequences based on its own existing tracking.

I view this as all very low level infrastructure; I'm really hoping we
can get an agreement with Christian and build a scatterlist replacement
for DMABUF that encapsulates all this away from drivers, like BIO does
for block. But we can't start that until we have a DMA API working fully
for non-struct page P2P memory. That is being driven by this series and
the VFIO DMABUF implementation on top of it.

> Are we trying to remove struct page from the kernel altogether?

Yes, it is a very long term project being pushed along with the folios,
memdesc conversion and so forth. It is huge, with many aspects, but we
can start to reasonably work on parts of it independently.

A mid-term dream is to be able to go from pin_user_pages() -> DMA
without drivers needing to touch struct page at all. This is a huge
project on its own, and we are progressing it slowly "bottom up" by
allowing phys_addr_t in the DMA API; then we can build more
infrastructure for subsystems to be struct-page free, culminating in
some pin_user_phyr() and phys_addr_t bio_vec someday.

Certainly a big part of this series is influenced by requirements to
advance pin_user_pages() -> DMA, while the other part is about allowing
P2P to work using phys_addr_t without struct page.

Jason
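A caller-side sketch of the two-path split described above; the helper is
hypothetical, and the pfn_valid() check merely stands in for the real P2P
evaluation, which in practice involves the p2pdma mapping type rather than
just page-backedness.

#include <linux/dma-mapping.h>
#include <linux/mm.h>

/* Hypothetical helper: pick the cachable-phys path or the resource path. */
static dma_addr_t map_one_range(struct device *dev, phys_addr_t phys,
				size_t len, enum dma_data_direction dir)
{
	if (pfn_valid(PHYS_PFN(phys)))
		/* Page-backed, cachable memory with a KVA available */
		return dma_map_page(dev, pfn_to_page(PHYS_PFN(phys)),
				    offset_in_page(phys), len, dir);

	/* Non-cachable MMIO with no KVA, e.g. a PCI BAR used for P2P */
	return dma_map_resource(dev, phys, len, dir, 0);
}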
On 25.06.2025 15:18, Leon Romanovsky wrote:
> This series refactors the DMA mapping to use physical addresses
> as the primary interface instead of page+offset parameters. This
> change aligns the DMA API with the underlying hardware reality where
> DMA operations work with physical addresses, not page structures.
>
> The series consists of 8 patches that progressively convert the DMA
> mapping infrastructure from page-based to physical address-based APIs:
>
> The series maintains backward compatibility by keeping the old
> page-based API as wrapper functions around the new physical
> address-based implementations.

Thanks for this rework! I assume that the next step is to add map_phys
callback also to the dma_map_ops and teach various dma-mapping providers
to use it to avoid more phys-to-page-to-phys conversions.

I only wonder if this newly introduced dma_map_phys()/dma_unmap_phys()
API is also suitable for the recently discussed PCI P2P DMA? While
adding a new API maybe we should take this into account?

My main concern is the lack of the source phys addr passed to the
dma_unmap_phys() function and I'm aware that this might complicate a bit
code conversion from old dma_map/unmap_page() API.

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
On Fri, Jun 27, 2025 at 03:44:10PM +0200, Marek Szyprowski wrote:
> On 25.06.2025 15:18, Leon Romanovsky wrote:
> > This series refactors the DMA mapping to use physical addresses
> > as the primary interface instead of page+offset parameters. This
> > change aligns the DMA API with the underlying hardware reality where
> > DMA operations work with physical addresses, not page structures.
> >
> > The series consists of 8 patches that progressively convert the DMA
> > mapping infrastructure from page-based to physical address-based APIs:
> >
> > The series maintains backward compatibility by keeping the old
> > page-based API as wrapper functions around the new physical
> > address-based implementations.
>
> Thanks for this rework! I assume that the next step is to add map_phys
> callback also to the dma_map_ops and teach various dma-mapping providers
> to use it to avoid more phys-to-page-to-phys conversions.

Probably Christoph will say yes, however I personally don't see any
benefit in this. Maybe I'm wrong here, but all existing .map_page()
implementation platforms don't support p2p anyway. They won't benefit
from such a conversion.

> I only wonder if this newly introduced dma_map_phys()/dma_unmap_phys()
> API is also suitable for the recently discussed PCI P2P DMA? While
> adding a new API maybe we should take this into account?

First, the immediate user (not related to p2p) is the block layer:
https://lore.kernel.org/linux-nvme/bcdcb5eb-17ed-412f-bf5c-303079798fe2@nvidia.com/T/#m7e715697d4b2e3997622a3400243477c75cab406

+static bool blk_dma_map_direct(struct request *req, struct device *dma_dev,
+		struct blk_dma_iter *iter, struct phys_vec *vec)
+{
+	iter->addr = dma_map_page(dma_dev, phys_to_page(vec->paddr),
+			offset_in_page(vec->paddr), vec->len, rq_dma_dir(req));
+	if (dma_mapping_error(dma_dev, iter->addr)) {
+		iter->status = BLK_STS_RESOURCE;
+		return false;
+	}
+	iter->len = vec->len;
+	return true;
+}

The block layer started to store phys addresses instead of struct pages,
and this phys_to_page() conversion in the data path will be avoided.

> My main concern is the lack of the source phys addr passed to the
> dma_unmap_phys() function and I'm aware that this might complicate a bit
> code conversion from old dma_map/unmap_page() API.
>
> Best regards
> --
> Marek Szyprowski, PhD
> Samsung R&D Institute Poland
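For comparison, a sketch of the same helper after a dma_map_phys()
conversion; the dma_map_phys() signature (dev, phys, size, dir, attrs) is
an assumption, and the surrounding types come from the quoted block-layer
patch rather than a mainline header.

static bool blk_dma_map_direct(struct request *req, struct device *dma_dev,
		struct blk_dma_iter *iter, struct phys_vec *vec)
{
	/* No phys_to_page()/offset_in_page() round-trip in the data path */
	iter->addr = dma_map_phys(dma_dev, vec->paddr, vec->len,
				  rq_dma_dir(req), 0);
	if (dma_mapping_error(dma_dev, iter->addr)) {
		iter->status = BLK_STS_RESOURCE;
		return false;
	}
	iter->len = vec->len;
	return true;
}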
On Fri, Jun 27, 2025 at 08:02:13PM +0300, Leon Romanovsky wrote:
> On Fri, Jun 27, 2025 at 03:44:10PM +0200, Marek Szyprowski wrote:
> > On 25.06.2025 15:18, Leon Romanovsky wrote:
> > > This series refactors the DMA mapping to use physical addresses
> > > as the primary interface instead of page+offset parameters. This
> > > change aligns the DMA API with the underlying hardware reality where
> > > DMA operations work with physical addresses, not page structures.
> > >
> > > The series consists of 8 patches that progressively convert the DMA
> > > mapping infrastructure from page-based to physical address-based APIs:
> > >
> > > The series maintains backward compatibility by keeping the old
> > > page-based API as wrapper functions around the new physical
> > > address-based implementations.
> >
> > Thanks for this rework! I assume that the next step is to add map_phys
> > callback also to the dma_map_ops and teach various dma-mapping providers
> > to use it to avoid more phys-to-page-to-phys conversions.
>
> Probably Christoph will say yes, however I personally don't see any
> benefit in this. Maybe I'm wrong here, but all existing .map_page()
> implementation platforms don't support p2p anyway. They won't benefit
> from such a conversion.
>
> > I only wonder if this newly introduced dma_map_phys()/dma_unmap_phys()
> > API is also suitable for the recently discussed PCI P2P DMA? While
> > adding a new API maybe we should take this into account?
>
> First, the immediate user (not related to p2p) is the block layer:
> https://lore.kernel.org/linux-nvme/bcdcb5eb-17ed-412f-bf5c-303079798fe2@nvidia.com/T/#m7e715697d4b2e3997622a3400243477c75cab406
>
> +static bool blk_dma_map_direct(struct request *req, struct device *dma_dev,
> +		struct blk_dma_iter *iter, struct phys_vec *vec)
> +{
> +	iter->addr = dma_map_page(dma_dev, phys_to_page(vec->paddr),
> +			offset_in_page(vec->paddr), vec->len, rq_dma_dir(req));
> +	if (dma_mapping_error(dma_dev, iter->addr)) {
> +		iter->status = BLK_STS_RESOURCE;
> +		return false;
> +	}
> +	iter->len = vec->len;
> +	return true;
> +}
>
> The block layer started to store phys addresses instead of struct pages,
> and this phys_to_page() conversion in the data path will be avoided.

I have almost completed the main user of this dma_map_phys() interface.
It is a rewrite of this patch:

[PATCH v3 3/3] vfio/pci: Allow MMIO regions to be exported through dma-buf
https://lore.kernel.org/all/20250307052248.405803-4-vivek.kasireddy@intel.com/

The whole populate_sgt()->dma_map_resource() block looks different now
and it is relying on dma_map_phys(), as we are exporting memory without
struct pages.
It will be something like this:

	for (i = 0; i < priv->nr_ranges; i++) {
		phys = pci_resource_start(priv->vdev->pdev,
					  dma_ranges[i].region_index);
		phys += dma_ranges[i].offset;

		if (priv->bus_addr) {
			addr = pci_p2pdma_bus_addr_map(&p2pdma_state, phys);
			fill_sg_entry(sgl, dma_ranges[i].length, addr);
			sgl = sg_next(sgl);
		} else if (dma_use_iova(&priv->state)) {
			ret = dma_iova_link(attachment->dev, &priv->state, phys,
					    priv->mapped_len,
					    dma_ranges[i].length, dir, attrs);
			if (ret)
				goto err_unmap_dma;

			priv->mapped_len += dma_ranges[i].length;
		} else {
			addr = dma_map_phys(attachment->dev, phys, 0,
					    dma_ranges[i].length, dir, attrs);
			ret = dma_mapping_error(attachment->dev, addr);
			if (ret)
				goto err_unmap_dma;

			fill_sg_entry(sgl, dma_ranges[i].length, addr);
			sgl = sg_next(sgl);
		}
	}

	if (dma_use_iova(&priv->state) && !priv->bus_addr) {
		ret = dma_iova_sync(attachment->dev, &priv->state, 0,
				    priv->mapped_len);
		if (ret)
			goto err_unmap_dma;

		fill_sg_entry(sgl, priv->mapped_len, priv->state.addr);
	}

> > My main concern is the lack of the source phys addr passed to the
> > dma_unmap_phys() function and I'm aware that this might complicate a bit
> > code conversion from old dma_map/unmap_page() API.

It is not needed for now; all the p2p logic is external to the DMA API.

Thanks

> > Best regards
> > --
> > Marek Szyprowski, PhD
> > Samsung R&D Institute Poland
On Fri, Jun 27, 2025 at 08:02:13PM +0300, Leon Romanovsky wrote:
> > Thanks for this rework! I assume that the next step is to add map_phys
> > callback also to the dma_map_ops and teach various dma-mapping providers
> > to use it to avoid more phys-to-page-to-phys conversions.
>
> Probably Christoph will say yes, however I personally don't see any
> benefit in this. Maybe I'm wrong here, but all existing .map_page()
> implementation platforms don't support p2p anyway. They won't benefit
> from such a conversion.

I think that conversion should eventually happen, and rather sooner than
later.
On 30.06.2025 15:38, Christoph Hellwig wrote:
> On Fri, Jun 27, 2025 at 08:02:13PM +0300, Leon Romanovsky wrote:
>>> Thanks for this rework! I assume that the next step is to add map_phys
>>> callback also to the dma_map_ops and teach various dma-mapping providers
>>> to use it to avoid more phys-to-page-to-phys conversions.
>> Probably Christoph will say yes, however I personally don't see any
>> benefit in this. Maybe I'm wrong here, but all existing .map_page()
>> implementation platforms don't support p2p anyway. They won't benefit
>> from such a conversion.
> I think that conversion should eventually happen, and rather sooner than
> later.

Agreed.

Applied patches 1-7 to my dma-mapping-next branch. Let me know if one
needs a stable branch with it.

Leon, it would be great if You could also prepare an incremental patch
adding a map_phys callback to the dma_map_ops, so the individual
arch-specific dma-mapping providers can then be converted (or simplified
in many cases) too.

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
On 2025-07-08 11:27 am, Marek Szyprowski wrote:
> On 30.06.2025 15:38, Christoph Hellwig wrote:
>> On Fri, Jun 27, 2025 at 08:02:13PM +0300, Leon Romanovsky wrote:
>>>> Thanks for this rework! I assume that the next step is to add map_phys
>>>> callback also to the dma_map_ops and teach various dma-mapping providers
>>>> to use it to avoid more phys-to-page-to-phys conversions.
>>> Probably Christoph will say yes, however I personally don't see any
>>> benefit in this. Maybe I'm wrong here, but all existing .map_page()
>>> implementation platforms don't support p2p anyway. They won't benefit
>>> from such a conversion.
>> I think that conversion should eventually happen, and rather sooner than
>> later.
>
> Agreed.
>
> Applied patches 1-7 to my dma-mapping-next branch. Let me know if one
> needs a stable branch with it.

As the maintainer of iommu-dma, please drop the iommu-dma patch because
it is broken. It does not in any way remove the struct page dependency
from iommu-dma, it merely hides it so things can crash more easily in
circumstances that clearly nobody's bothered to test.

> Leon, it would be great if You could also prepare an incremental patch
> adding a map_phys callback to the dma_map_ops, so the individual
> arch-specific dma-mapping providers can then be converted (or simplified
> in many cases) too.

Marek, I'm surprised that even you aren't seeing why that would at best
be pointless churn. The fundamental design of dma_map_page() operating
on struct page is that it sits in between alloc_pages() at the caller
and kmap_atomic() deep down in the DMA API implementation (which also
subsumes any dependencies on having a kernel virtual address at the
implementation end). The natural working unit for whatever replaces
dma_map_page() will be whatever the replacement for alloc_pages()
returns, and the replacement for kmap_atomic() operates on. Until that
exists (and I simply cannot believe it would be an unadorned physical
address) there cannot be any *meaningful* progress made towards removing
the struct page dependency from the DMA API. If there is also a goal to
kill off highmem before then, then logically we should just wait for
that to land, then revert back to dma_map_single() being the first-class
interface, and dma_map_page() can turn into a trivial page_to_virt()
wrapper for the long tail of caller conversions.

Simply obfuscating the struct page dependency today by dressing it up as
a phys_addr_t with implicit baggage is not in any way helpful. It only
makes the code harder to understand and more bug-prone. Despite the
disingenuous claims, it is quite blatantly the opposite of "efficient"
for callers to do extra work to throw away useful information with
page_to_phys(), and the implementation then has to re-derive that
information with pfn_valid()/phys_to_page().

And by "bug-prone" I also include greater distractions like this
misguided idea that the same API could somehow work for non-memory
addresses too, so then everyone can move on to bikeshedding VFIO while
overlooking the fundamental flaws in the whole premise. I mean, besides
all the issues I've already pointed out in that regard, not least the
glaring fact that it's literally just a worse version of *an API we
already have*, as DMA API maintainer do you *really* approve of a design
that depends on callers abusing DMA_ATTR_SKIP_CPU_SYNC, yet will still
readily blow up if they did then call a dma_sync op?

Thanks,
Robin.
Hi Robin,

I don't know the DMA mapping code well and haven't reviewed this patch
set in particular, but I wanted to comment on some of the things you say
here.

> Marek, I'm surprised that even you aren't seeing why that would at best be
> pointless churn. The fundamental design of dma_map_page() operating on
> struct page is that it sits in between alloc_pages() at the caller and
> kmap_atomic() deep down in the DMA API implementation (which also subsumes
> any dependencies on having a kernel virtual address at the implementation
> end). The natural working unit for whatever replaces dma_map_page() will be
> whatever the replacement for alloc_pages() returns, and the replacement for
> kmap_atomic() operates on. Until that exists (and I simply cannot believe it
> would be an unadorned physical address) there cannot be any *meaningful*
> progress made towards removing the struct page dependency from the DMA API.
> If there is also a goal to kill off highmem before then, then logically we
> should just wait for that to land, then revert back to dma_map_single()
> being the first-class interface, and dma_map_page() can turn into a trivial
> page_to_virt() wrapper for the long tail of caller conversions.

While I'm sure we'd all love to kill off highmem, that's not a realistic
goal for another ten years or so. There are meaningful improvements we
can make, for example pulling page tables out of highmem, but we need to
keep file data and anonymous memory in highmem, so we'll need to support
DMA to highmem for the foreseeable future.

The replacement for kmap_atomic() is already here -- it's
kmap_(atomic|local)_pfn(). If a simple wrapper like kmap_local_phys()
would make this more palatable, that would be fine by me. Might save
a bit of messing around with calculating offsets in each caller.

As far as replacing alloc_pages() goes, some callers will still use
alloc_pages(). Others will use folio_alloc() or have used kmalloc().
Or maybe the caller won't have used any kind of page allocation because
they're doing I/O to something that isn't part of Linux's memory at all.
Part of the Grand Plan here is for Linux to catch up with Xen's ability
to do I/O to guests without allocating struct pages for every page of
memory in the guests.

You say that a physical address will need some adornment -- can you
elaborate on that for me? It may be that I'm missing something important
here.
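A minimal sketch of the suggested wrapper, assuming it just forwards to
kmap_local_pfn() and folds in the sub-page offset; kmap_local_phys() and
kunmap_local_phys() themselves are hypothetical, only the pfn-based
primitives exist today.

#include <linux/highmem.h>
#include <linux/pfn.h>

/* Hypothetical wrapper: map the page containing @phys and return a
 * kernel pointer to the byte at @phys. */
static inline void *kmap_local_phys(phys_addr_t phys)
{
	return kmap_local_pfn(PHYS_PFN(phys)) + offset_in_page(phys);
}

/* Matching unmap; kunmap_local() already accepts any address inside the
 * mapped page. */
static inline void kunmap_local_phys(void *addr)
{
	kunmap_local(addr);
}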
On Thu, Jul 31, 2025 at 06:37:11PM +0100, Matthew Wilcox wrote:
> The replacement for kmap_atomic() is already here -- it's
> kmap_(atomic|local)_pfn(). If a simple wrapper like kmap_local_phys()
> would make this more palatable, that would be fine by me. Might save
> a bit of messing around with calculating offsets in each caller.

I think that makes the general plan clearer. We should be removing
struct pages entirely from the insides of the DMA API layer and use
phys_addr_t, kmap_XX_phys(), phys_to_virt(), and so on.

The request from Christoph and Marek to clean up the dma_ops makes sense
in that context; we'd have to go into the ops and replace the struct
page kmaps/etc with the phys based ones.

This hides the struct page requirement to get to a KVA inside the core
mm code only, and that sort of modularity is exactly the sort of thing
that could help entirely remove a struct page requirement for some kinds
of DMA someday.

Matthew, do you think it makes sense to introduce types to make this
clearer? We have two kinds of values that a phys_addr_t can store -
something compatible with kmap_XX_phys(), and something that isn't.

This was recently a long discussion in ARM KVM as well, which had a
similar confusion that a phys_addr_t was actually two very different
things inside its logic.

So what about some dedicated types:

 kphys_addr_t - A physical address that can be passed to kmap_XX_phys(),
   phys_to_virt(), etc.

 raw_phys_addr_t - A physical address that may not be cachable, may not
   be DRAM, and does not work with kmap_XX_phys()/etc.

We clearly have these two different ideas floating around in code, page
tables, etc. I read some of Robin's concern that the struct page
provided a certain amount of type safety in the DMA API; this could
provide similar.

Thanks,
Jason
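A rough sketch of what the proposed annotations could look like; both type
names and the checked helper are purely hypothetical illustrations of the
idea above, not existing kernel types.

#include <linux/bug.h>
#include <linux/mm.h>
#include <linux/pfn.h>

/* Hypothetical: a phys_addr_t that is page-backed and kmap()-able. */
typedef phys_addr_t kphys_addr_t;
/* Hypothetical: a phys_addr_t that may be MMIO - no KVA, maybe not cachable. */
typedef phys_addr_t raw_phys_addr_t;

/* Hypothetical checked conversion: only page-backed memory qualifies. */
static inline kphys_addr_t phys_to_kphys(phys_addr_t phys)
{
	WARN_ON_ONCE(!pfn_valid(PHYS_PFN(phys)));
	return (kphys_addr_t)phys;
}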
On Sun, Aug 03, 2025 at 12:59:06PM -0300, Jason Gunthorpe wrote:
> Matthew, do you think it makes sense to introduce types to make this
> clearer? We have two kinds of values that a phys_addr_t can store -
> something compatible with kmap_XX_phys(), and something that isn't.

I was with you up until this point. And then you said "What if we have
a raccoon that isn't a raccoon" and my brain derailed.

> This was recently a long discussion in ARM KVM as well which had a
> similar confusion that a phys_addr_t was actually two very different
> things inside its logic.

No. A phys_addr_t is a phys_addr_t. If something's abusing a phys_addr_t
to store something entirely different then THAT is what should be using
a different type.

We've defined what a phys_addr_t is. That was in
Documentation/core-api/bus-virt-phys-mapping.rst before Arnd removed it;
to excerpt the relevant bit:

---
 - CPU untranslated. This is the "physical" address. Physical address
   0 is what the CPU sees when it drives zeroes on the memory bus.

[...]

So why do we care about the physical address at all? We do need the
physical address in some cases, it's just not very often in normal code.
The physical address is needed if you use memory mappings, for example,
because the "remap_pfn_range()" mm function wants the physical address
of the memory to be remapped as measured in units of pages, a.k.a. the
pfn.
---

So if somebody is stuffing something else into phys_addr_t, *THAT* is
what needs to be fixed, not adding a new sub-type of phys_addr_t for
things which are actually phys_addr_t.

> We clearly have these two different ideas floating around in code,
> page tables, etc.

No. No, we don't. I've never heard of this asininity before.
On Mon, Aug 04, 2025 at 04:37:56AM +0100, Matthew Wilcox wrote:
> On Sun, Aug 03, 2025 at 12:59:06PM -0300, Jason Gunthorpe wrote:
> > Matthew, do you think it makes sense to introduce types to make this
> > clearer? We have two kinds of values that a phys_addr_t can store -
> > something compatible with kmap_XX_phys(), and something that isn't.
>
> I was with you up until this point. And then you said "What if we have
> a raccoon that isn't a raccoon" and my brain derailed.

I thought it was clear:

  kmap_local_pfn(phys >> PAGE_SHIFT)
  phys_to_virt(phys)

do not work for all values of phys. They are definitely illegal for
non-cachable MMIO. Agree?

There is a subset of phys that is cachable and has a struct page, and
that subset is what is usable with kmap_local_pfn()/etc.

phys is always this:

> - CPU untranslated. This is the "physical" address. Physical address
>   0 is what the CPU sees when it drives zeroes on the memory bus.

But that is a pure HW perspective. It doesn't say which of our SW APIs
are allowed to use this address.

We have callchains in DMA API land that want to do a kmap at the bottom.
It would be nice to mark the whole call chain that the phys_addr being
passed around is actually required to be kmappable. Because if you pass
a non-kmappable MMIO-backed phys it will explode in some way on some
platforms.

> > We clearly have these two different ideas floating around in code,
> > page tables, etc.
> No. No, we don't. I've never heard of this asininity before.

Welcome to the fun world of cachable and non-cachable memory.

Consider, today we can create struct pages of type
MEMORY_DEVICE_PCI_P2PDMA for non-cachable MMIO. I think today you "can"
use kmap to establish a cachable mapping in the vmap. But it is
*illegal* to establish a cachable CPU mapping of MMIO. Archs are free to
MCE if you do this - a speculative cache line load of MMIO can just
error in HW inside the interconnect.

So, the phys_addr is always a "CPU untranslated physical address" but
the cachable/non-cachable cases, or DRAM vs MMIO, are sometimes
semantically very different things for the SW!

Jason
On 30.07.2025 13:11, Robin Murphy wrote:
> On 2025-07-08 11:27 am, Marek Szyprowski wrote:
>> On 30.06.2025 15:38, Christoph Hellwig wrote:
>>> On Fri, Jun 27, 2025 at 08:02:13PM +0300, Leon Romanovsky wrote:
>>>>> Thanks for this rework! I assume that the next step is to add map_phys
>>>>> callback also to the dma_map_ops and teach various dma-mapping providers
>>>>> to use it to avoid more phys-to-page-to-phys conversions.
>>>> Probably Christoph will say yes, however I personally don't see any
>>>> benefit in this. Maybe I'm wrong here, but all existing .map_page()
>>>> implementation platforms don't support p2p anyway. They won't benefit
>>>> from such a conversion.
>>> I think that conversion should eventually happen, and rather sooner than
>>> later.
>>
>> Agreed.
>>
>> Applied patches 1-7 to my dma-mapping-next branch. Let me know if one
>> needs a stable branch with it.
>
> As the maintainer of iommu-dma, please drop the iommu-dma patch because
> it is broken. It does not in any way remove the struct page dependency
> from iommu-dma, it merely hides it so things can crash more easily in
> circumstances that clearly nobody's bothered to test.
>
>> Leon, it would be great if You could also prepare an incremental patch
>> adding a map_phys callback to the dma_map_ops, so the individual
>> arch-specific dma-mapping providers can then be converted (or simplified
>> in many cases) too.
>
> Marek, I'm surprised that even you aren't seeing why that would at best
> be pointless churn. The fundamental design of dma_map_page() operating
> on struct page is that it sits in between alloc_pages() at the caller
> and kmap_atomic() deep down in the DMA API implementation (which also
> subsumes any dependencies on having a kernel virtual address at the
> implementation end). The natural working unit for whatever replaces
> dma_map_page() will be whatever the replacement for alloc_pages()
> returns, and the replacement for kmap_atomic() operates on. Until that
> exists (and I simply cannot believe it would be an unadorned physical
> address) there cannot be any *meaningful* progress made towards
> removing the struct page dependency from the DMA API. If there is also
> a goal to kill off highmem before then, then logically we should just
> wait for that to land, then revert back to dma_map_single() being the
> first-class interface, and dma_map_page() can turn into a trivial
> page_to_virt() wrapper for the long tail of caller conversions.
>
> Simply obfuscating the struct page dependency today by dressing it up
> as a phys_addr_t with implicit baggage is not in any way helpful. It
> only makes the code harder to understand and more bug-prone. Despite
> the disingenuous claims, it is quite blatantly the opposite of
> "efficient" for callers to do extra work to throw away useful
> information with page_to_phys(), and the implementation then has to
> re-derive that information with pfn_valid()/phys_to_page().
>
> And by "bug-prone" I also include greater distractions like this
> misguided idea that the same API could somehow work for non-memory
> addresses too, so then everyone can move on to bikeshedding VFIO while
> overlooking the fundamental flaws in the whole premise. I mean, besides
> all the issues I've already pointed out in that regard, not least the
> glaring fact that it's literally just a worse version of *an API we
> already have*, as DMA API maintainer do you *really* approve of a
> design that depends on callers abusing DMA_ATTR_SKIP_CPU_SYNC, yet will
> still readily blow up if they did then call a dma_sync op?

Robin, your concerns are right. I missed the fact that making everything
depend on phys_addr_t would make the DMA-mapping API prone to various
abuses. I need to think a bit more on this and try to better understand
the PCI P2P case, which means that I will probably miss this merge
window.

I'm sorry for not being more active in the discussion, but I just got
back from my holidays and I'm trying to catch up.

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
On Wed, Jul 30, 2025 at 12:11:32PM +0100, Robin Murphy wrote:
> On 2025-07-08 11:27 am, Marek Szyprowski wrote:
> > On 30.06.2025 15:38, Christoph Hellwig wrote:
> > > On Fri, Jun 27, 2025 at 08:02:13PM +0300, Leon Romanovsky wrote:
> > > > > Thanks for this rework! I assume that the next step is to add map_phys
> > > > > callback also to the dma_map_ops and teach various dma-mapping providers
> > > > > to use it to avoid more phys-to-page-to-phys conversions.
> > > > Probably Christoph will say yes, however I personally don't see any
> > > > benefit in this. Maybe I'm wrong here, but all existing .map_page()
> > > > implementation platforms don't support p2p anyway. They won't benefit
> > > > from such a conversion.
> > > I think that conversion should eventually happen, and rather sooner than
> > > later.
> >
> > Agreed.
> >
> > Applied patches 1-7 to my dma-mapping-next branch. Let me know if one
> > needs a stable branch with it.
>
> As the maintainer of iommu-dma, please drop the iommu-dma patch because
> it is broken. It does not in any way remove the struct page dependency
> from iommu-dma, it merely hides it so things can crash more easily in
> circumstances that clearly nobody's bothered to test.
>
> > Leon, it would be great if You could also prepare an incremental patch
> > adding a map_phys callback to the dma_map_ops, so the individual
> > arch-specific dma-mapping providers can then be converted (or simplified
> > in many cases) too.
>
> Marek, I'm surprised that even you aren't seeing why that would at best
> be pointless churn. The fundamental design of dma_map_page() operating
> on struct page is that it sits in between alloc_pages() at the caller
> and kmap_atomic() deep down in the DMA API implementation (which also
> subsumes any dependencies on having a kernel virtual address at the
> implementation end). The natural working unit for whatever replaces
> dma_map_page() will be whatever the replacement for alloc_pages()
> returns, and the replacement for kmap_atomic() operates on. Until that
> exists (and I simply cannot believe it would be an unadorned physical
> address) there cannot be any *meaningful* progress made towards
> removing the struct page dependency from the DMA API. If there is also
> a goal to kill off highmem before then, then logically we should just
> wait for that to land, then revert back to dma_map_single() being the
> first-class interface, and dma_map_page() can turn into a trivial
> page_to_virt() wrapper for the long tail of caller conversions.
>
> Simply obfuscating the struct page dependency today by dressing it up
> as a phys_addr_t with implicit baggage is not in any way helpful. It
> only makes the code harder to understand and more bug-prone. Despite
> the disingenuous claims, it is quite blatantly the opposite of
> "efficient" for callers to do extra work to throw away useful
> information with page_to_phys(), and the implementation then has to
> re-derive that information with pfn_valid()/phys_to_page().
>
> And by "bug-prone" I also include greater distractions like this
> misguided idea that the same API could somehow work for non-memory
> addresses too, so then everyone can move on to bikeshedding VFIO while
> overlooking the fundamental flaws in the whole premise. I mean, besides
> all the issues I've already pointed out in that regard, not least the
> glaring fact that it's literally just a worse version of *an API we
> already have*, as DMA API maintainer do you *really* approve of a
> design that depends on callers abusing DMA_ATTR_SKIP_CPU_SYNC, yet will
> still readily blow up if they did then call a dma_sync op?

Robin, Marek

I would like to ask you not to drop this series and to allow me to
gradually change the code during my VFIO DMABUF adventure.

The most reasonable way to prevent DMA_ATTR_SKIP_CPU_SYNC leakage is to
introduce a new DMA attribute (let's call it DMA_ATTR_MMIO for now) and
pass it to both dma_map_phys() and dma_iova_link(). This flag will
indicate that the p2p type is PCI_P2PDMA_MAP_THRU_HOST_BRIDGE and will
route to the right callbacks, which will set the IOMMU_MMIO flag and
skip the CPU sync. dma_map_phys() isn't entirely wrong, it just needs a
few extra tweaks.

Thanks

> Thanks,
> Robin.
On Wed, Jul 30, 2025 at 04:40:26PM +0300, Leon Romanovsky wrote:
> > The natural working unit for whatever replaces dma_map_page() will be
> > whatever the replacement for alloc_pages() returns, and the replacement for
> > kmap_atomic() operates on. Until that exists (and I simply cannot believe it
> > would be an unadorned physical address) there cannot be any
> > *meaningful*

alloc_pages becomes legacy. There will be some new API 'memdesc alloc'.
If I understand Matthew's plan properly - here is a sketch of changing
iommu-pages:

--- a/drivers/iommu/iommu-pages.c
+++ b/drivers/iommu/iommu-pages.c
@@ -36,9 +36,10 @@ static_assert(sizeof(struct ioptdesc) <= sizeof(struct page));
  */
 void *iommu_alloc_pages_node_sz(int nid, gfp_t gfp, size_t size)
 {
+	struct ioptdesc *desc;
 	unsigned long pgcnt;
-	struct folio *folio;
 	unsigned int order;
+	void *addr;

 	/* This uses page_address() on the memory. */
 	if (WARN_ON(gfp & __GFP_HIGHMEM))
@@ -56,8 +57,8 @@ void *iommu_alloc_pages_node_sz(int nid, gfp_t gfp, size_t size)
 	if (nid == NUMA_NO_NODE)
 		nid = numa_mem_id();

-	folio = __folio_alloc_node(gfp | __GFP_ZERO, order, nid);
-	if (unlikely(!folio))
+	addr = memdesc_alloc_pages(&desc, gfp | __GFP_ZERO, order, nid);
+	if (unlikely(!addr))
 		return NULL;

 	/*
@@ -73,7 +74,7 @@ void *iommu_alloc_pages_node_sz(int nid, gfp_t gfp, size_t size)
 	mod_node_page_state(folio_pgdat(folio), NR_IOMMU_PAGES, pgcnt);
 	lruvec_stat_mod_folio(folio, NR_SECONDARY_PAGETABLE, pgcnt);

-	return folio_address(folio);
+	return addr;
 }

Where the memdesc_alloc_pages() will kmalloc a 'struct ioptdesc' and
some other change so that virt_to_ioptdesc() indirects through a new
memdesc. See here:

https://kernelnewbies.org/MatthewWilcox/Memdescs

We don't end up with some kind of catch-all struct to mean 'cachable CPU
memory' anymore because every user gets their own unique "struct
XXXdesc". So the thinking has been that phys_addr_t is the best option.

I guess the alternative would be the memdesc as a handle, but I'm not
sure that is such a good idea. People still express a desire to be able
to do IO to cachable memory that has a KVA through phys_to_virt but no
memdesc/page allocation. I don't know if this will happen, but it
doesn't seem like a good idea to make it impossible by forcing memdesc
types into low level APIs that don't use them.

Also, the bio/scatterlist code between pin_user_pages() and DMA mapping
is consolidating physical contiguity. This runs faster if you don't have
to do page_to_phys() because everything is already phys_addr_t.

> > progress made towards removing the struct page dependency from the DMA API.
> > If there is also a goal to kill off highmem before then, then logically we
> > should just wait for that to land, then revert back to dma_map_single()
> > being the first-class interface, and dma_map_page() can turn into a trivial
> > page_to_virt() wrapper for the long tail of caller conversions.

As I said there are many many projects related here and we can
meaningfully make progress in parts.

It is not functionally harmful to do the phys to page conversion before
calling the legacy dma_ops/SWIOTLB etc. This avoids creating patch
dependencies with highmem removal and other projects. So long as the
legacy things (highmem, dma_ops, etc) continue to work I think it is OK
to accept some obfuscation to allow the modern things to work better.

The majority flow - no highmem, no dma ops, no swiotlb - does not
require struct page. Having to do PTE -> phys -> page -> phys -> DMA
does have a cost.

> The most reasonable way to prevent DMA_ATTR_SKIP_CPU_SYNC leakage is to
> introduce a new DMA attribute (let's call it DMA_ATTR_MMIO for now) and
> pass it to both dma_map_phys() and dma_iova_link(). This flag will
> indicate that the p2p type is PCI_P2PDMA_MAP_THRU_HOST_BRIDGE and will
> route to the right callbacks, which will set the IOMMU_MMIO flag and
> skip the CPU sync.

So the idea is that if the memory is non-cachable and has no KVA, you'd
call dma_iova_link(phys_addr, DMA_ATTR_MMIO) and dma_map_phys(phys_addr,
DMA_ATTR_MMIO)?

And then internally the dma_ops and dma_iommu would use the existing
map_page/map_resource variations based on the flag, thus ensuring that
MMIO is never kmap'd or cache flushed?

dma_map_resource() is really then just dma_map_phys(phys_addr,
DMA_ATTR_MMIO)?

I like this, I think it well addresses the concerns.

Jason
On Wed, Jul 30, 2025 at 11:28:18AM -0300, Jason Gunthorpe wrote:
> On Wed, Jul 30, 2025 at 04:40:26PM +0300, Leon Romanovsky wrote:

<...>

> > The most reasonable way to prevent DMA_ATTR_SKIP_CPU_SYNC leakage is to
> > introduce a new DMA attribute (let's call it DMA_ATTR_MMIO for now) and
> > pass it to both dma_map_phys() and dma_iova_link(). This flag will
> > indicate that the p2p type is PCI_P2PDMA_MAP_THRU_HOST_BRIDGE and will
> > route to the right callbacks, which will set the IOMMU_MMIO flag and
> > skip the CPU sync.
>
> So the idea is that if the memory is non-cachable and has no KVA, you'd
> call dma_iova_link(phys_addr, DMA_ATTR_MMIO) and dma_map_phys(phys_addr,
> DMA_ATTR_MMIO)?

Yes

> And then internally the dma_ops and dma_iommu would use the existing
> map_page/map_resource variations based on the flag, thus ensuring that
> MMIO is never kmap'd or cache flushed?
>
> dma_map_resource() is really then just dma_map_phys(phys_addr,
> DMA_ATTR_MMIO)?
>
> I like this, I think it well addresses the concerns.

Yes, I had this idea and implementation before. :(

> Jason
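For readers following the thread, a purely illustrative sketch of the
flag-based dispatch agreed above. DMA_ATTR_MMIO, the dma_map_phys()
signature and the internal routing are assumptions drawn from this
discussion, not an actual implementation; the dma-direct path, error
handling and debug hooks are ignored.

#include <linux/dma-map-ops.h>

dma_addr_t dma_map_phys(struct device *dev, phys_addr_t phys, size_t size,
			enum dma_data_direction dir, unsigned long attrs)
{
	const struct dma_map_ops *ops = get_dma_ops(dev);

	if (attrs & DMA_ATTR_MMIO)
		/* MMIO: never kmap'd, cache-flushed or bounce-buffered */
		return ops->map_resource(dev, phys, size, dir, attrs);

	/* Page-backed memory: the existing callback still works on struct page */
	return ops->map_page(dev, pfn_to_page(PHYS_PFN(phys)),
			     offset_in_page(phys), size, dir, attrs);
}

In this picture dma_map_resource() would indeed reduce to
dma_map_phys(..., DMA_ATTR_MMIO), as Jason observes.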
On Tue, Jul 08, 2025 at 12:27:09PM +0200, Marek Szyprowski wrote:
> On 30.06.2025 15:38, Christoph Hellwig wrote:
> > On Fri, Jun 27, 2025 at 08:02:13PM +0300, Leon Romanovsky wrote:
> >>> Thanks for this rework! I assume that the next step is to add map_phys
> >>> callback also to the dma_map_ops and teach various dma-mapping providers
> >>> to use it to avoid more phys-to-page-to-phys conversions.
> >> Probably Christoph will say yes, however I personally don't see any
> >> benefit in this. Maybe I'm wrong here, but all existing .map_page()
> >> implementation platforms don't support p2p anyway. They won't benefit
> >> from such a conversion.
> > I think that conversion should eventually happen, and rather sooner than
> > later.
>
> Agreed.
>
> Applied patches 1-7 to my dma-mapping-next branch. Let me know if one
> needs a stable branch with it.

Thanks a lot, I don't think that a stable branch is needed.
Realistically speaking, my VFIO DMA work won't be merged this cycle. We
are in -rc5, it is a complete rewrite of the RFC version and touches
pci-p2p code (to remove the dependency on struct page) in addition to
VFIO, so it will take time.

Regarding the last patch (hmm), it would be great if you can take it. We
didn't touch anything in hmm.c this cycle and have no plans to send a
PR. It can safely go through your tree.

> Leon, it would be great if You could also prepare an incremental patch
> adding a map_phys callback to the dma_map_ops, so the individual
> arch-specific dma-mapping providers can then be converted (or simplified
> in many cases) too.

Sure, will do.

> Best regards
> --
> Marek Szyprowski, PhD
> Samsung R&D Institute Poland
On 08.07.2025 13:00, Leon Romanovsky wrote:
> On Tue, Jul 08, 2025 at 12:27:09PM +0200, Marek Szyprowski wrote:
>> On 30.06.2025 15:38, Christoph Hellwig wrote:
>>> On Fri, Jun 27, 2025 at 08:02:13PM +0300, Leon Romanovsky wrote:
>>>>> Thanks for this rework! I assume that the next step is to add map_phys
>>>>> callback also to the dma_map_ops and teach various dma-mapping providers
>>>>> to use it to avoid more phys-to-page-to-phys conversions.
>>>> Probably Christoph will say yes, however I personally don't see any
>>>> benefit in this. Maybe I'm wrong here, but all existing .map_page()
>>>> implementation platforms don't support p2p anyway. They won't benefit
>>>> from such a conversion.
>>> I think that conversion should eventually happen, and rather sooner than
>>> later.
>> Agreed.
>>
>> Applied patches 1-7 to my dma-mapping-next branch. Let me know if one
>> needs a stable branch with it.
> Thanks a lot, I don't think that a stable branch is needed.
> Realistically speaking, my VFIO DMA work won't be merged this cycle. We
> are in -rc5, it is a complete rewrite of the RFC version and touches
> pci-p2p code (to remove the dependency on struct page) in addition to
> VFIO, so it will take time.
>
> Regarding the last patch (hmm), it would be great if you can take it. We
> didn't touch anything in hmm.c this cycle and have no plans to send a
> PR. It can safely go through your tree.

Okay, then I would like to get an explicit ack from Jérôme for this.

>> Leon, it would be great if You could also prepare an incremental patch
>> adding a map_phys callback to the dma_map_ops, so the individual
>> arch-specific dma-mapping providers can then be converted (or simplified
>> in many cases) too.
> Sure, will do.

Thanks!

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
On Tue, Jul 08, 2025 at 01:45:20PM +0200, Marek Szyprowski wrote:
> On 08.07.2025 13:00, Leon Romanovsky wrote:
> > On Tue, Jul 08, 2025 at 12:27:09PM +0200, Marek Szyprowski wrote:
> >> On 30.06.2025 15:38, Christoph Hellwig wrote:
> >>> On Fri, Jun 27, 2025 at 08:02:13PM +0300, Leon Romanovsky wrote:
> >>>>> Thanks for this rework! I assume that the next step is to add map_phys
> >>>>> callback also to the dma_map_ops and teach various dma-mapping providers
> >>>>> to use it to avoid more phys-to-page-to-phys conversions.
> >>>> Probably Christoph will say yes, however I personally don't see any
> >>>> benefit in this. Maybe I'm wrong here, but all existing .map_page()
> >>>> implementation platforms don't support p2p anyway. They won't benefit
> >>>> from such a conversion.
> >>> I think that conversion should eventually happen, and rather sooner than
> >>> later.
> >> Agreed.
> >>
> >> Applied patches 1-7 to my dma-mapping-next branch. Let me know if one
> >> needs a stable branch with it.
> > Thanks a lot, I don't think that a stable branch is needed.
> > Realistically speaking, my VFIO DMA work won't be merged this cycle. We
> > are in -rc5, it is a complete rewrite of the RFC version and touches
> > pci-p2p code (to remove the dependency on struct page) in addition to
> > VFIO, so it will take time.
> >
> > Regarding the last patch (hmm), it would be great if you can take it. We
> > didn't touch anything in hmm.c this cycle and have no plans to send a
> > PR. It can safely go through your tree.
>
> Okay, then I would like to get an explicit ack from Jérôme for this.

Jerome has not been active in the HMM world for a long time already.
The HMM tree is managed by us (RDMA):
https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/log/?h=hmm

➜ kernel git:(m/dmabuf-vfio) git log --merges mm/hmm.c
...
    Pull HMM updates from Jason Gunthorpe:
...

https://web.git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=58ba80c4740212c29a1cf9b48f588e60a7612209
+hmm	git	git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git#hmm

We just never bothered to reflect the current situation in the
MAINTAINERS file.

Thanks
On 08.07.2025 14:06, Leon Romanovsky wrote:
> On Tue, Jul 08, 2025 at 01:45:20PM +0200, Marek Szyprowski wrote:
>> On 08.07.2025 13:00, Leon Romanovsky wrote:
>>> On Tue, Jul 08, 2025 at 12:27:09PM +0200, Marek Szyprowski wrote:
>>>> On 30.06.2025 15:38, Christoph Hellwig wrote:
>>>>> On Fri, Jun 27, 2025 at 08:02:13PM +0300, Leon Romanovsky wrote:
>>>>>>> Thanks for this rework! I assume that the next step is to add map_phys
>>>>>>> callback also to the dma_map_ops and teach various dma-mapping providers
>>>>>>> to use it to avoid more phys-to-page-to-phys conversions.
>>>>>> Probably Christoph will say yes, however I personally don't see any
>>>>>> benefit in this. Maybe I'm wrong here, but all existing .map_page()
>>>>>> implementation platforms don't support p2p anyway. They won't benefit
>>>>>> from such a conversion.
>>>>> I think that conversion should eventually happen, and rather sooner than
>>>>> later.
>>>> Agreed.
>>>>
>>>> Applied patches 1-7 to my dma-mapping-next branch. Let me know if one
>>>> needs a stable branch with it.
>>> Thanks a lot, I don't think that a stable branch is needed.
>>> Realistically speaking, my VFIO DMA work won't be merged this cycle. We
>>> are in -rc5, it is a complete rewrite of the RFC version and touches
>>> pci-p2p code (to remove the dependency on struct page) in addition to
>>> VFIO, so it will take time.
>>>
>>> Regarding the last patch (hmm), it would be great if you can take it. We
>>> didn't touch anything in hmm.c this cycle and have no plans to send a
>>> PR. It can safely go through your tree.
>> Okay, then I would like to get an explicit ack from Jérôme for this.
> Jerome has not been active in the HMM world for a long time already.
> The HMM tree is managed by us (RDMA):
> https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/log/?h=hmm
>
> ➜ kernel git:(m/dmabuf-vfio) git log --merges mm/hmm.c
> ...
>     Pull HMM updates from Jason Gunthorpe:
> ...
>
> https://web.git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=58ba80c4740212c29a1cf9b48f588e60a7612209
> +hmm	git	git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git#hmm
>
> We just never bothered to reflect the current situation in the
> MAINTAINERS file.

Maybe this is the time to update it :)

I was just a bit confused that no-one commented on the HMM patch, but if
You maintain it, then this is okay.

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
On 08.07.2025 14:56, Marek Szyprowski wrote:
> On 08.07.2025 14:06, Leon Romanovsky wrote:
>> On Tue, Jul 08, 2025 at 01:45:20PM +0200, Marek Szyprowski wrote:
>>> On 08.07.2025 13:00, Leon Romanovsky wrote:
>>>> On Tue, Jul 08, 2025 at 12:27:09PM +0200, Marek Szyprowski wrote:
>>>>> On 30.06.2025 15:38, Christoph Hellwig wrote:
>>>>>> On Fri, Jun 27, 2025 at 08:02:13PM +0300, Leon Romanovsky wrote:
>>>>>>>> Thanks for this rework! I assume that the next step is to add
>>>>>>>> map_phys callback also to the dma_map_ops and teach various
>>>>>>>> dma-mapping providers to use it to avoid more
>>>>>>>> phys-to-page-to-phys conversions.
>>>>>>> Probably Christoph will say yes, however I personally don't see any
>>>>>>> benefit in this. Maybe I'm wrong here, but all existing .map_page()
>>>>>>> implementation platforms don't support p2p anyway. They won't
>>>>>>> benefit from such a conversion.
>>>>>> I think that conversion should eventually happen, and rather
>>>>>> sooner than later.
>>>>> Agreed.
>>>>>
>>>>> Applied patches 1-7 to my dma-mapping-next branch. Let me know if one
>>>>> needs a stable branch with it.
>>>> Thanks a lot, I don't think that a stable branch is needed.
>>>> Realistically speaking, my VFIO DMA work won't be merged this cycle.
>>>> We are in -rc5, it is a complete rewrite of the RFC version and
>>>> touches pci-p2p code (to remove the dependency on struct page) in
>>>> addition to VFIO, so it will take time.
>>>>
>>>> Regarding the last patch (hmm), it would be great if you can take it.
>>>> We didn't touch anything in hmm.c this cycle and have no plans to
>>>> send a PR. It can safely go through your tree.
>>> Okay, then I would like to get an explicit ack from Jérôme for this.
>> Jerome has not been active in the HMM world for a long time already.
>> The HMM tree is managed by us (RDMA):
>> https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/log/?h=hmm
>>
>> ➜ kernel git:(m/dmabuf-vfio) git log --merges mm/hmm.c
>> ...
>>     Pull HMM updates from Jason Gunthorpe:
>> ...
>>
>> https://web.git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=58ba80c4740212c29a1cf9b48f588e60a7612209
>> +hmm	git	git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git#hmm
>>
>> We just never bothered to reflect the current situation in the
>> MAINTAINERS file.
>
> Maybe this is the time to update it :)
>
> I was just a bit confused that no-one commented on the HMM patch, but if
> You maintain it, then this is okay.

I've applied the last patch to the dma-mapping-for-next branch.

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland