On some SoCs without an IOMMU behind the PCIe controller, the PCIe
controller memory access could be limited to a small region by the
firmware configuring a memory protection unit. This memory region
must be assigned to the PCIe controller so that the OS knows to
use that region. Otherwise PCIe devices would not work properly.
Allow the memory-region property with one item pointing to a
restricted DMA buffer.
Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
---
This patch complements another patch that moved the memory-region from
the PCIe device to the PCIe controller [1].
[1] https://lore.kernel.org/all/20260430120725.241779-1-wenst@chromium.org/
Documentation/devicetree/bindings/pci/mediatek-pcie-gen3.yaml | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/Documentation/devicetree/bindings/pci/mediatek-pcie-gen3.yaml b/Documentation/devicetree/bindings/pci/mediatek-pcie-gen3.yaml
index 4db700fc36ba..4a9e41d01628 100644
--- a/Documentation/devicetree/bindings/pci/mediatek-pcie-gen3.yaml
+++ b/Documentation/devicetree/bindings/pci/mediatek-pcie-gen3.yaml
@@ -115,6 +115,10 @@ properties:
   power-domains:
     maxItems: 1

+  memory-region:
+    maxItems: 1
+    description: phandle to restricted DMA buffer
+
   mediatek,pbus-csr:
     $ref: /schemas/types.yaml#/definitions/phandle-array
     items:
--
2.54.0.563.g4f69b47b94-goog
On Fri, May 08, 2026 at 02:36:32PM +0800, Chen-Yu Tsai wrote:
> On some SoCs without an IOMMU behind the PCIe controller, the PCIe
> controller memory access could be limited to a small region by the
> firmware configuring a memory protection unit. This memory region
> must be assigned to the PCIe controller so that the OS knows to
> use that region. Otherwise PCIe devices would not work properly.
>
So this means, the PCIe devices can only access a specific carveout memory
configured by MPU for DMA? If so, you should use 'dma-ranges' as suggested by
Rob.
'memory-region' also serves the purpose, but for PCI, we have the dedicated
'dma-ranges' property.
- Mani
--
மணிவண்ணன் சதாசிவம்
On Thu, May 14, 2026 at 1:23 PM Manivannan Sadhasivam <mani@kernel.org> wrote:
>
> On Fri, May 08, 2026 at 02:36:32PM +0800, Chen-Yu Tsai wrote:
> > On some SoCs without an IOMMU behind the PCIe controller, the PCIe
> > controller memory access could be limited to a small region by the
> > firmware configuring a memory protection unit. This memory region
> > must be assigned to the PCIe controller so that the OS knows to
> > use that region. Otherwise PCIe devices would not work properly.
> >
>
> So this means, the PCIe devices can only access a specific carveout memory
> configured by MPU for DMA? If so, you should use 'dma-ranges' as suggested by
> Rob.
>
> 'memory-region' also serves the purpose, but for PCI, we have the dedicated
> 'dma-ranges' property.
I think I need some sort of guide on writing the 'dma-ranges' property,
because it is not working for me.
I'm adding
dma-ranges = <0x42000000 0 0x00000000 0 0xc0000000 0 0x4000000>;
to the PCIe controller node, and dropping the memory-region. The WiFi
driver subsequently fails to allocate buffers:
rtw88_8822ce 0000:01:00.0: enabling device (0000 -> 0003)
rtw88_8822ce 0000:01:00.0: failed to allocate tx ring
This is dma_alloc_coherent() failing.
rtw88_8822ce 0000:01:00.0: Firmware version 9.9.15, H2C version 15
rtw88_8822ce 0000:01:00.0: failed to allocate pci resources
rtw88_8822ce 0000:01:00.0: WOW Firmware version 9.9.4, H2C version 15
rtw88_8822ce 0000:01:00.0: failed to setup pci resources
rtw88_8822ce 0000:01:00.0: probe with driver rtw88_8822ce failed with error -12
Also, using memory-region seems more straightforward: I have a region of
memory dedicated to the PCIe controller. I describe the memory region,
and assign it to the PCIe controller.
Thanks
ChenYu
On Thu, May 14, 2026 at 03:54:29PM +0800, Chen-Yu Tsai wrote:
> On Thu, May 14, 2026 at 1:23 PM Manivannan Sadhasivam <mani@kernel.org> wrote:
> >
> > On Fri, May 08, 2026 at 02:36:32PM +0800, Chen-Yu Tsai wrote:
> > > On some SoCs without an IOMMU behind the PCIe controller, the PCIe
> > > controller memory access could be limited to a small region by the
> > > firmware configuring a memory protection unit. This memory region
> > > must be assigned to the PCIe controller so that the OS knows to
> > > use that region. Otherwise PCIe devices would not work properly.
> > >
> >
> > So this means, the PCIe devices can only access a specific carveout memory
> > configured by MPU for DMA? If so, you should use 'dma-ranges' as suggested by
> > Rob.
> >
> > 'memory-region' also serves the purpose, but for PCI, we have the dedicated
> > 'dma-ranges' property.
>
> I think I need some sort of guide on writing the 'dma-ranges' property,
> because it is not working for me.
>
> I'm adding
>
> dma-ranges = <0x42000000 0 0x00000000 0 0xc0000000 0 0x4000000>;
>
So the device DMA address start from 0x0? Isn't it a 1:1 mapping?
dma-ranges = <0x42000000 0 0xc0000000 0 0xc0000000 0 0x4000000>;
- Mani
--
மணிவண்ணன் சதாசிவம்
On Thu, May 14, 2026 at 7:48 PM Manivannan Sadhasivam <mani@kernel.org> wrote:
>
> On Thu, May 14, 2026 at 03:54:29PM +0800, Chen-Yu Tsai wrote:
> > On Thu, May 14, 2026 at 1:23 PM Manivannan Sadhasivam <mani@kernel.org> wrote:
> > >
> > > On Fri, May 08, 2026 at 02:36:32PM +0800, Chen-Yu Tsai wrote:
> > > > On some SoCs without an IOMMU behind the PCIe controller, the PCIe
> > > > controller memory access could be limited to a small region by the
> > > > firmware configuring a memory protection unit. This memory region
> > > > must be assigned to the PCIe controller so that the OS knows to
> > > > use that region. Otherwise PCIe devices would not work properly.
> > > >
> > >
> > > So this means, the PCIe devices can only access a specific carveout memory
> > > configured by MPU for DMA? If so, you should use 'dma-ranges' as suggested by
> > > Rob.
> > >
> > > 'memory-region' also serves the purpose, but for PCI, we have the dedicated
> > > 'dma-ranges' property.
> >
> > I think I need some sort of guide on writing the 'dma-ranges' property,
> > because it is not working for me.
> >
> > I'm adding
> >
> > dma-ranges = <0x42000000 0 0x00000000 0 0xc0000000 0 0x4000000>;
> >
>
> So the device DMA address start from 0x0? Isn't it a 1:1 mapping?
I actually don't know. But
> dma-ranges = <0x42000000 0 0xc0000000 0 0xc0000000 0 0x4000000>;
this didn't work either.
On Fri, May 15, 2026 at 05:16:19PM +0800, Chen-Yu Tsai wrote:
> On Thu, May 14, 2026 at 7:48 PM Manivannan Sadhasivam <mani@kernel.org> wrote:
> >
> > On Thu, May 14, 2026 at 03:54:29PM +0800, Chen-Yu Tsai wrote:
> > > On Thu, May 14, 2026 at 1:23 PM Manivannan Sadhasivam <mani@kernel.org> wrote:
> > > >
> > > > On Fri, May 08, 2026 at 02:36:32PM +0800, Chen-Yu Tsai wrote:
> > > > > On some SoCs without an IOMMU behind the PCIe controller, the PCIe
> > > > > controller memory access could be limited to a small region by the
> > > > > firmware configuring a memory protection unit. This memory region
> > > > > must be assigned to the PCIe controller so that the OS knows to
> > > > > use that region. Otherwise PCIe devices would not work properly.
> > > > >
> > > >
> > > > So this means, the PCIe devices can only access a specific carveout memory
> > > > configured by MPU for DMA? If so, you should use 'dma-ranges' as suggested by
> > > > Rob.
> > > >
> > > > 'memory-region' also serves the purpose, but for PCI, we have the dedicated
> > > > 'dma-ranges' property.
> > >
> > > I think I need some sort of guide on writing the 'dma-ranges' property,
> > > because it is not working for me.
> > >
> > > I'm adding
> > >
> > > dma-ranges = <0x42000000 0 0x00000000 0 0xc0000000 0 0x4000000>;
> > >
> >
> > So the device DMA address start from 0x0? Isn't it a 1:1 mapping?
>
> I actually don't know. But
>
> > dma-ranges = <0x42000000 0 0xc0000000 0 0xc0000000 0 0x4000000>;
>
> this didn't work either.
Hmm. Can you print the DMA address programmed to the device? i.e., the address
returned by dma_map_single() in the driver.
Also, using prefetchable flag is not correct for DMA memory. You should use:
dma-ranges = <0x02000000 0 0xc0000000 0 0xc0000000 0 0x4000000>;
- Mani
--
மணிவண்ணன் சதாசிவம்
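For readers unfamiliar with the cell layout: each 'dma-ranges' entry tried above can be read as seven 32-bit cells, assuming the usual PCI host bridge layout of three child (PCI) address cells, two parent (CPU) address cells and two size cells. That cell split is an assumption about this board's devicetree; it is not stated in the thread. A minimal Python sketch that decodes the first attempt and the suggested non-prefetchable variant (decode_dma_range is a hypothetical helper, not a kernel function):

```python
# Sketch only: decode one PCI 'dma-ranges' entry.
# Assumes 3 PCI address cells + 2 parent address cells + 2 size cells,
# which is an assumption about the board's devicetree.

SPACE_NAMES = {0: "config", 1: "io", 2: "mem32", 3: "mem64"}


def decode_dma_range(cells):
    """Split a 7-cell dma-ranges entry into named fields."""
    if len(cells) != 7:
        raise ValueError("expected 7 cells")
    phys_hi, pci_mid, pci_lo, cpu_hi, cpu_lo, size_hi, size_lo = cells
    return {
        # Bit 30 of phys.hi is the 'p' (prefetchable) flag.
        "prefetchable": bool(phys_hi & (1 << 30)),
        # Bits 25:24 of phys.hi are the 'ss' space code.
        "space": SPACE_NAMES[(phys_hi >> 24) & 0x3],
        "pci_addr": (pci_mid << 32) | pci_lo,
        "cpu_addr": (cpu_hi << 32) | cpu_lo,
        "size": (size_hi << 32) | size_lo,
    }


# First attempt from the thread.
first = decode_dma_range([0x42000000, 0, 0x00000000, 0, 0xc0000000, 0, 0x4000000])
# Non-prefetchable 1:1 variant suggested in the reply.
suggested = decode_dma_range([0x02000000, 0, 0xc0000000, 0, 0xc0000000, 0, 0x4000000])

for name, entry in (("first", first), ("suggested", suggested)):
    print(name, {k: hex(v) if isinstance(v, int) and not isinstance(v, bool) else v
                 for k, v in entry.items()})
```

Under those assumptions, the first attempt maps PCI address 0x0 to CPU address 0xc0000000 with the prefetchable bit set, while the suggested entry is an identity (1:1) mapping of the same 64 MiB window without that bit.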
On Fri, May 15, 2026 at 8:34 PM Manivannan Sadhasivam <mani@kernel.org> wrote:
>
> On Fri, May 15, 2026 at 05:16:19PM +0800, Chen-Yu Tsai wrote:
> > On Thu, May 14, 2026 at 7:48 PM Manivannan Sadhasivam <mani@kernel.org> wrote:
> > >
> > > On Thu, May 14, 2026 at 03:54:29PM +0800, Chen-Yu Tsai wrote:
> > > > On Thu, May 14, 2026 at 1:23 PM Manivannan Sadhasivam <mani@kernel.org> wrote:
> > > > >
> > > > > On Fri, May 08, 2026 at 02:36:32PM +0800, Chen-Yu Tsai wrote:
> > > > > > On some SoCs without an IOMMU behind the PCIe controller, the PCIe
> > > > > > controller memory access could be limited to a small region by the
> > > > > > firmware configuring a memory protection unit. This memory region
> > > > > > must be assigned to the PCIe controller so that the OS knows to
> > > > > > use that region. Otherwise PCIe devices would not work properly.
> > > > > >
> > > > >
> > > > > So this means, the PCIe devices can only access a specific carveout memory
> > > > > configured by MPU for DMA? If so, you should use 'dma-ranges' as suggested by
> > > > > Rob.
> > > > >
> > > > > 'memory-region' also serves the purpose, but for PCI, we have the dedicated
> > > > > 'dma-ranges' property.
> > > >
> > > > I think I need some sort of guide on writing the 'dma-ranges' property,
> > > > because it is not working for me.
> > > >
> > > > I'm adding
> > > >
> > > > dma-ranges = <0x42000000 0 0x00000000 0 0xc0000000 0 0x4000000>;
> > > >
> > >
> > > So the device DMA address start from 0x0? Isn't it a 1:1 mapping?
> >
> > I actually don't know. But
> >
> > > dma-ranges = <0x42000000 0 0xc0000000 0 0xc0000000 0 0x4000000>;
> >
> > this didn't work either.
>
>
> Hmm. Can you print the DMA address programmed to the device? i.e., the address
> returned by dma_map_single() in the driver.
On a working system still using the restricted-dma-pool memory region,
it gives something like 0x00000000c0009000, so indeed it is 1:1 mapping?
These are for the RX/TX descriptors [1][2].
When using dma-ranges, the failure is from dma_alloc_coherent() [3][4],
which is the descriptor ring. On a working system, this is something
like 0x00000000c0c9d000, so again 1:1.
[1] https://elixir.bootlin.com/linux/v7.0.8/source/drivers/net/wireless/realtek/rtw88/pci.c#L221
[2] https://elixir.bootlin.com/linux/v7.0.8/source/drivers/net/wireless/realtek/rtw88/pci.c#L829
[3] https://elixir.bootlin.com/linux/v7.0.8/source/drivers/net/wireless/realtek/rtw88/pci.c#L192
[4] https://elixir.bootlin.com/linux/v7.0.8/source/drivers/net/wireless/realtek/rtw88/pci.c#L265
> Also, using prefetchable flag is not correct for DMA memory. You should use:
>
> dma-ranges = <0x02000000 0 0xc0000000 0 0xc0000000 0 0x4000000>;
This didn't work either. What exactly is supposed to handle dma-ranges?
I see some code parsing it in the PCI core, but it just saves it to a list.
Here's a function graph trace for the dma_alloc_coherent() call:
funcgraph_entry:              |  dma_alloc_attrs() {
funcgraph_entry:     6.538 us |    dma_alloc_from_dev_coherent(); (ret=0x0)
funcgraph_entry:              |    dma_direct_alloc() {
funcgraph_entry:              |      __dma_direct_alloc_pages.isra.0() {
funcgraph_entry:     4.846 us |        dma_alloc_contiguous(); (ret=0x0)
funcgraph_entry:              |        __alloc_pages_noprof() {
funcgraph_entry:              |          __alloc_frozen_pages_noprof() {
funcgraph_entry:     5.539 us |            fs_reclaim_acquire(); (ret=0xffffff80c7dcd580)
funcgraph_entry:     5.077 us |            fs_reclaim_release(); (ret=0xffffff80c7dcd580)
funcgraph_entry:              |            __might_sleep() {
funcgraph_entry:     5.153 us |              __might_resched(); (ret=0x0)
funcgraph_exit:   + 16.230 us |            } (ret=0x0)
funcgraph_entry:     5.077 us |            __next_zones_zonelist(); (ret=0xffffffd055d598e0)
funcgraph_entry:              |            get_page_from_freelist() {
funcgraph_entry:              |              _raw_spin_trylock() {
funcgraph_entry:     5.385 us |                do_raw_spin_trylock(); (ret=0x1)
funcgraph_exit:   + 16.923 us |              } (ret=0x1)
funcgraph_entry:              |              _raw_spin_unlock() {
funcgraph_entry:     5.077 us |                do_raw_spin_unlock(); (ret=0x1)
funcgraph_exit:   + 16.538 us |              } (ret=0x100000001)
funcgraph_exit:   + 54.231 us |            } (ret=0xfffffffec051c540)
funcgraph_exit:  ! 123.462 us |          } (ret=0xfffffffec051c540)
funcgraph_exit:  ! 134.692 us |        } (ret=0xfffffffec051c540)
funcgraph_entry:              |        __free_pages() {
funcgraph_entry:              |          ___free_pages() {
funcgraph_entry:              |            __free_frozen_pages() {
funcgraph_entry:     5.538 us |              __get_pfnblock_flags_mask.isra.0(); (ret=0x0)
funcgraph_entry:              |              _raw_spin_trylock() {
funcgraph_entry:     5.077 us |                do_raw_spin_trylock(); (ret=0x1)
funcgraph_exit:   + 16.538 us |              } (ret=0x1)
funcgraph_entry:     5.385 us |              free_frozen_page_commit(); (ret=0x1)
funcgraph_entry:              |              _raw_spin_unlock() {
funcgraph_entry:     5.077 us |                do_raw_spin_unlock(); (ret=0x1)
funcgraph_exit:   + 16.385 us |              } (ret=0x100000001)
funcgraph_exit:   + 75.000 us |            } (ret=0x0)
funcgraph_exit:   + 86.230 us |          } (ret=0x0)
funcgraph_exit:   + 97.384 us |        } (ret=0x0)
funcgraph_exit:  ! 262.846 us |      } (ret=0x0)
funcgraph_exit:  ! 274.538 us |    } (ret=0x0)
funcgraph_exit:  ! 309.077 us |  } (ret=0x0)
And here are kernel logs for all the system's memory regions:
Reserved memory: created DMA memory pool at 0x000000013ff00000, size 1 MiB
OF: reserved mem: initialized node audio-dma-pool, compatible id shared-dma-pool
OF: reserved mem: 0x000000013ff00000..0x000000013fffffff (1024 KiB) nomap non-reusable audio-dma-pool
OF: reserved mem: 0x00000000ffe65000..0x00000000fff64fff (1024 KiB) map non-reusable ramoops
Reserved memory: created DMA memory pool at 0x0000000050000000, size 41 MiB
OF: reserved mem: initialized node scp@50000000, compatible id shared-dma-pool
OF: reserved mem: 0x0000000050000000..0x00000000528fffff (41984 KiB) nomap non-reusable scp@50000000
cma: Reserved 16 MiB at 0x00000000c3000000
Zone ranges:
DMA [mem 0x0000000040000000-0x00000000c3ffffff]
DMA32 [mem 0x00000000c4000000-0x00000000ffffffff]
Normal [mem 0x0000000100000000-0x000000013fffffff]
Early memory node ranges
node 0: [mem 0x0000000040000000-0x000000004fffffff]
node 0: [mem 0x0000000050000000-0x00000000528fffff]
node 0: [mem 0x0000000052900000-0x00000000545fffff]
node 0: [mem 0x0000000054700000-0x00000000ffdfffff]
node 0: [mem 0x0000000100000000-0x000000013fefffff]
node 0: [mem 0x000000013ff00000-0x000000013fffffff]
software IO TLB: area num 8.
software IO TLB: mapped [mem 0x00000000bf000000-0x00000000c3000000] (64MB)
So I think it could be that the usable memory has all been given away to
other bits? But then dma_alloc_contiguous() returned NULL.
ChenYu
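The kernel log above allows a quick check of that guess. A small Python sketch, assuming the dma-ranges window is the 64 MiB at 0xc0000000 used in the earlier attempts and that the end address in the "software IO TLB: mapped" line is exclusive (the range spans exactly 64MB that way), computes how much of the window the SWIOTLB and CMA areas cover:

```python
# Sketch only: how much of the assumed dma-ranges window is already taken
# by the SWIOTLB buffer and the CMA area reported in the boot log.

MIB = 1 << 20


def overlap(a, b):
    """Bytes shared by two half-open [start, end) address ranges."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


# Assumed window: CPU address 0xc0000000, size 0x4000000 (from dma-ranges).
window = (0xc0000000, 0xc0000000 + 0x4000000)
# "software IO TLB: mapped [mem 0xbf000000-0xc3000000] (64MB)"
swiotlb = (0xbf000000, 0xc3000000)
# "cma: Reserved 16 MiB at 0xc3000000"
cma = (0xc3000000, 0xc3000000 + 16 * MIB)

swiotlb_in_window = overlap(window, swiotlb)
cma_in_window = overlap(window, cma)
# The two areas are adjacent and do not overlap each other, so a plain sum is fine.
free_in_window = (window[1] - window[0]) - swiotlb_in_window - cma_in_window

print("SWIOTLB inside window:", swiotlb_in_window // MIB, "MiB")
print("CMA inside window:    ", cma_in_window // MIB, "MiB")
print("left for page alloc:  ", free_in_window // MIB, "MiB")
```

With these numbers the SWIOTLB buffer covers 48 MiB of the window and the CMA area covers the remaining 16 MiB, so nothing in the window is left for the page allocator.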
On Mon, May 18, 2026 at 05:02:11PM +0800, Chen-Yu Tsai wrote:
> On Fri, May 15, 2026 at 8:34 PM Manivannan Sadhasivam <mani@kernel.org> wrote:
> >
> > On Fri, May 15, 2026 at 05:16:19PM +0800, Chen-Yu Tsai wrote:
> > > On Thu, May 14, 2026 at 7:48 PM Manivannan Sadhasivam <mani@kernel.org> wrote:
> > > >
> > > > On Thu, May 14, 2026 at 03:54:29PM +0800, Chen-Yu Tsai wrote:
> > > > > On Thu, May 14, 2026 at 1:23 PM Manivannan Sadhasivam <mani@kernel.org> wrote:
> > > > > >
> > > > > > On Fri, May 08, 2026 at 02:36:32PM +0800, Chen-Yu Tsai wrote:
> > > > > > > On some SoCs without an IOMMU behind the PCIe controller, the PCIe
> > > > > > > controller memory access could be limited to a small region by the
> > > > > > > firmware configuring a memory protection unit. This memory region
> > > > > > > must be assigned to the PCIe controller so that the OS knows to
> > > > > > > use that region. Otherwise PCIe devices would not work properly.
> > > > > > >
> > > > > >
> > > > > > So this means, the PCIe devices can only access a specific carveout memory
> > > > > > configured by MPU for DMA? If so, you should use 'dma-ranges' as suggested by
> > > > > > Rob.
> > > > > >
> > > > > > 'memory-region' also serves the purpose, but for PCI, we have the dedicated
> > > > > > 'dma-ranges' property.
> > > > >
> > > > > I think I need some sort of guide on writing the 'dma-ranges' property,
> > > > > because it is not working for me.
> > > > >
> > > > > I'm adding
> > > > >
> > > > > dma-ranges = <0x42000000 0 0x00000000 0 0xc0000000 0 0x4000000>;
> > > > >
> > > >
> > > > So the device DMA address start from 0x0? Isn't it a 1:1 mapping?
> > >
> > > I actually don't know. But
> > >
> > > > dma-ranges = <0x42000000 0 0xc0000000 0 0xc0000000 0 0x4000000>;
> > >
> > > this didn't work either.
> >
> >
> > Hmm. Can you print the DMA address programmed to the device? i.e., the address
> > returned by dma_map_single() in the driver.
>
> On a working system still using the restricted-dma-pool memory region,
> it gives something like 0x00000000c0009000, so indeed it is 1:1 mapping?
It has to be 1:1 mapping.
> These are for the RX/TX descriptors [1][2].
>
> When using dma-ranges, the failure is from dma_alloc_coherent() [3][4],
> which is the descriptor ring. On a working system, this is something
> like 0x00000000c0c9d000, so again 1:1.
>
> [1] https://elixir.bootlin.com/linux/v7.0.8/source/drivers/net/wireless/realtek/rtw88/pci.c#L221
> [2] https://elixir.bootlin.com/linux/v7.0.8/source/drivers/net/wireless/realtek/rtw88/pci.c#L829
> [3] https://elixir.bootlin.com/linux/v7.0.8/source/drivers/net/wireless/realtek/rtw88/pci.c#L192
> [4] https://elixir.bootlin.com/linux/v7.0.8/source/drivers/net/wireless/realtek/rtw88/pci.c#L265
>
> > Also, using prefetchable flag is not correct for DMA memory. You should use:
> >
> > dma-ranges = <0x02000000 0 0xc0000000 0 0xc0000000 0 0x4000000>;
>
> This didn't work either. What exactly is supposed to handle dma-ranges?
> I see some code parsing it in the PCI core, but it just saves it to a list.
>
I think the failure is due to marking the memory as 'reserved' in DT. With
'dma-ranges', the allocator will only ensure that the allocated memory stays
within this limit. But the allocator itself will not use this property to
allocate from the reserved region.
Now, I'm not sure if you can reliably get dma-ranges to work for this use case
of forcing dma_alloc_coherent() to use the reserved memory.
So it looks like 'memory-region' is your only option here.
- Mani
--
மணிவண்ணன் சதாசிவம்
On Tue, May 19, 2026 at 3:21 PM Manivannan Sadhasivam <mani@kernel.org> wrote:
>
> On Mon, May 18, 2026 at 05:02:11PM +0800, Chen-Yu Tsai wrote:
> > On Fri, May 15, 2026 at 8:34 PM Manivannan Sadhasivam <mani@kernel.org> wrote:
> > >
> > > On Fri, May 15, 2026 at 05:16:19PM +0800, Chen-Yu Tsai wrote:
> > > > On Thu, May 14, 2026 at 7:48 PM Manivannan Sadhasivam <mani@kernel.org> wrote:
> > > > >
> > > > > On Thu, May 14, 2026 at 03:54:29PM +0800, Chen-Yu Tsai wrote:
> > > > > > On Thu, May 14, 2026 at 1:23 PM Manivannan Sadhasivam <mani@kernel.org> wrote:
> > > > > > >
> > > > > > > On Fri, May 08, 2026 at 02:36:32PM +0800, Chen-Yu Tsai wrote:
> > > > > > > > On some SoCs without an IOMMU behind the PCIe controller, the PCIe
> > > > > > > > controller memory access could be limited to a small region by the
> > > > > > > > firmware configuring a memory protection unit. This memory region
> > > > > > > > must be assigned to the PCIe controller so that the OS knows to
> > > > > > > > use that region. Otherwise PCIe devices would not work properly.
> > > > > > > >
> > > > > > >
> > > > > > > So this means, the PCIe devices can only access a specific carveout memory
> > > > > > > configured by MPU for DMA? If so, you should use 'dma-ranges' as suggested by
> > > > > > > Rob.
> > > > > > >
> > > > > > > 'memory-region' also serves the purpose, but for PCI, we have the dedicated
> > > > > > > 'dma-ranges' property.
> > > > > >
> > > > > > I think I need some sort of guide on writing the 'dma-ranges' property,
> > > > > > because it is not working for me.
> > > > > >
> > > > > > I'm adding
> > > > > >
> > > > > > dma-ranges = <0x42000000 0 0x00000000 0 0xc0000000 0 0x4000000>;
> > > > > >
> > > > >
> > > > > So the device DMA address start from 0x0? Isn't it a 1:1 mapping?
> > > >
> > > > I actually don't know. But
> > > >
> > > > > dma-ranges = <0x42000000 0 0xc0000000 0 0xc0000000 0 0x4000000>;
> > > >
> > > > this didn't work either.
> > >
> > >
> > > Hmm. Can you print the DMA address programmed to the device? i.e., the address
> > > returned by dma_map_single() in the driver.
> >
> > On a working system still using the restricted-dma-pool memory region,
> > it gives something like 0x00000000c0009000, so indeed it is 1:1 mapping?
>
> It has to be 1:1 mapping.
>
> > These are for the RX/TX descriptors [1][2].
> >
> > When using dma-ranges, the failure is from dma_alloc_coherent() [3][4],
> > which is the descriptor ring. On a working system, this is something
> > like 0x00000000c0c9d000, so again 1:1.
> >
> > [1] https://elixir.bootlin.com/linux/v7.0.8/source/drivers/net/wireless/realtek/rtw88/pci.c#L221
> > [2] https://elixir.bootlin.com/linux/v7.0.8/source/drivers/net/wireless/realtek/rtw88/pci.c#L829
> > [3] https://elixir.bootlin.com/linux/v7.0.8/source/drivers/net/wireless/realtek/rtw88/pci.c#L192
> > [4] https://elixir.bootlin.com/linux/v7.0.8/source/drivers/net/wireless/realtek/rtw88/pci.c#L265
> >
> > > Also, using prefetchable flag is not correct for DMA memory. You should use:
> > >
> > > dma-ranges = <0x02000000 0 0xc0000000 0 0xc0000000 0 0x4000000>;
> >
> > This didn't work either. What exactly is supposed to handle dma-ranges?
> > I see some code parsing it in the PCI core, but it just saves it to a list.
> >
>
> I think the failure is due to marking the memory as 'reserved' in DT. With
> 'dma-ranges', the allocator will only ensure that the allocated memory stays
> within this limit. But the allocator itself will not use this property to
> allocate from the reserved region.
It didn't work with the reserved regions removed either, since CMA and
SWIOTLB take up the space by coincidence.
> Now, I'm not sure if you can reliably get dma-ranges to work for this usecase
> of forcing the dma_alloc_coherent() to use the reserved memory.
Well, I think that would be a bit sketchy. But we do want the reserved
memory, as the whole point of limiting PCIe DMA to that region is to
isolate the DMA, so we don't want the system using it for something
else and potentially getting overwritten by some rogue PCIe device.
> So looks like 'memory-region' is your only option here.
Thanks. Hopefully Rob understands and gives an ack for the DT binding
change.
ChenYu
On Fri, May 08, 2026 at 02:36:32PM +0800, Chen-Yu Tsai wrote:
> On some SoCs without an IOMMU behind the PCIe controller, the PCIe
> controller memory access could be limited to a small region by the
> firmware configuring a memory protection unit. This memory region
> must be assigned to the PCIe controller so that the OS knows to
> use that region. Otherwise PCIe devices would not work properly.
>
What you are describing is dma-ranges. Why not use that?
> Allow the memory-region property with one item pointing to a
> restricted DMA buffer.
>
> Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
> ---
> This patch complements another patch that moved the memory-region from
> the PCIe device to the PCIe controller [1].
>
> [1] https://lore.kernel.org/all/20260430120725.241779-1-wenst@chromium.org/
>
>  Documentation/devicetree/bindings/pci/mediatek-pcie-gen3.yaml | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/pci/mediatek-pcie-gen3.yaml b/Documentation/devicetree/bindings/pci/mediatek-pcie-gen3.yaml
> index 4db700fc36ba..4a9e41d01628 100644
> --- a/Documentation/devicetree/bindings/pci/mediatek-pcie-gen3.yaml
> +++ b/Documentation/devicetree/bindings/pci/mediatek-pcie-gen3.yaml
> @@ -115,6 +115,10 @@ properties:
>    power-domains:
>      maxItems: 1
>
> +  memory-region:
> +    maxItems: 1
> +    description: phandle to restricted DMA buffer
> +
>    mediatek,pbus-csr:
>      $ref: /schemas/types.yaml#/definitions/phandle-array
>      items:
> --
> 2.54.0.563.g4f69b47b94-goog
>
On Thu, May 14, 2026 at 7:15 AM Rob Herring <robh@kernel.org> wrote:
>
> On Fri, May 08, 2026 at 02:36:32PM +0800, Chen-Yu Tsai wrote:
> > On some SoCs without an IOMMU behind the PCIe controller, the PCIe
> > controller memory access could be limited to a small region by the
> > firmware configuring a memory protection unit. This memory region
> > must be assigned to the PCIe controller so that the OS knows to
> > use that region. Otherwise PCIe devices would not work properly.
> >
>
> What you are describing is dma-ranges. Why not use that?
Answer from yesterday: I didn't know about it. I was just moving the property
from the WiFi controller node down to the PCIe controller in the other DT
patch [1].
Answer from today: Also, it doesn't work. See my reply to Mani.
ChenYu
> > Allow the memory-region property with one item pointing to a
> > restricted DMA buffer.
> >
> > Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
> > ---
> > This patch complements another patch that moved the memory-region from
> > the PCIe device to the PCIe controller [1].
> >
> > [1] https://lore.kernel.org/all/20260430120725.241779-1-wenst@chromium.org/
> >
> >  Documentation/devicetree/bindings/pci/mediatek-pcie-gen3.yaml | 4 ++++
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/Documentation/devicetree/bindings/pci/mediatek-pcie-gen3.yaml b/Documentation/devicetree/bindings/pci/mediatek-pcie-gen3.yaml
> > index 4db700fc36ba..4a9e41d01628 100644
> > --- a/Documentation/devicetree/bindings/pci/mediatek-pcie-gen3.yaml
> > +++ b/Documentation/devicetree/bindings/pci/mediatek-pcie-gen3.yaml
> > @@ -115,6 +115,10 @@ properties:
> >    power-domains:
> >      maxItems: 1
> >
> > +  memory-region:
> > +    maxItems: 1
> > +    description: phandle to restricted DMA buffer
> > +
> >    mediatek,pbus-csr:
> >      $ref: /schemas/types.yaml#/definitions/phandle-array
> >      items:
> > --
> > 2.54.0.563.g4f69b47b94-goog
> >
On 5/8/26 08:36, Chen-Yu Tsai wrote:
> On some SoCs without an IOMMU behind the PCIe controller, the PCIe
> controller memory access could be limited to a small region by the
> firmware configuring a memory protection unit. This memory region
> must be assigned to the PCIe controller so that the OS knows to
> use that region. Otherwise PCIe devices would not work properly.
>
> Allow the memory-region property with one item pointing to a
> restricted DMA buffer.
>
> Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>

Makes a lot of sense, and actually makes us able to provide a correct
hardware description in the devicetrees.

Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
On Fri, May 08, 2026 at 02:36:32PM +0800, Chen-Yu Tsai wrote:
> On some SoCs without an IOMMU behind the PCIe controller, the PCIe
> controller memory access could be limited to a small region by the
> firmware configuring a memory protection unit. This memory region
> must be assigned to the PCIe controller so that the OS knows to
> use that region. Otherwise PCIe devices would not work properly.
>
> Allow the memory-region property with one item pointing to a
> restricted DMA buffer.
>
> Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
> ---
> This patch compliments another patch that moved the memory-region from
> the PCIe device to the PCIe controller [1].
>
> [1] https://lore.kernel.org/all/20260430120725.241779-1-wenst@chromium.org/
>
>  Documentation/devicetree/bindings/pci/mediatek-pcie-gen3.yaml | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/pci/mediatek-pcie-gen3.yaml b/Documentation/devicetree/bindings/pci/mediatek-pcie-gen3.yaml
> index 4db700fc36ba..4a9e41d01628 100644
> --- a/Documentation/devicetree/bindings/pci/mediatek-pcie-gen3.yaml
> +++ b/Documentation/devicetree/bindings/pci/mediatek-pcie-gen3.yaml
> @@ -115,6 +115,10 @@ properties:
>    power-domains:
>      maxItems: 1
>
> +  memory-region:
> +    maxItems: 1
> +    description: phandle to restricted DMA buffer

I guess this is similar to
https://lore.kernel.org/linux-pci/20250716053950.199079-1-huaqian.li@siemens.com/
and uses
https://github.com/devicetree-org/dt-schema/blob/main/dtschema/schemas/reserved-memory/shared-dma-pool.yaml?

Looks like those keystone changes were never merged; I don't know what
happened to them. But it will be good if everybody does it the same
way.

I wish there were a simple way to grep for this restricted DMA
concept. Maybe there is and I just haven't found it :)

>    mediatek,pbus-csr:
>      $ref: /schemas/types.yaml#/definitions/phandle-array
>      items:
> --
> 2.54.0.563.g4f69b47b94-goog
>
On Sat, May 9, 2026 at 1:54 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Fri, May 08, 2026 at 02:36:32PM +0800, Chen-Yu Tsai wrote:
> > On some SoCs without an IOMMU behind the PCIe controller, the PCIe
> > controller memory access could be limited to a small region by the
> > firmware configuring a memory protection unit. This memory region
> > must be assigned to the PCIe controller so that the OS knows to
> > use that region. Otherwise PCIe devices would not work properly.
> >
> > Allow the memory-region property with one item pointing to a
> > restricted DMA buffer.
> >
> > Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
> > ---
> > This patch compliments another patch that moved the memory-region from
> > the PCIe device to the PCIe controller [1].
> >
> > [1] https://lore.kernel.org/all/20260430120725.241779-1-wenst@chromium.org/
> >
> >  Documentation/devicetree/bindings/pci/mediatek-pcie-gen3.yaml | 4 ++++
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/Documentation/devicetree/bindings/pci/mediatek-pcie-gen3.yaml b/Documentation/devicetree/bindings/pci/mediatek-pcie-gen3.yaml
> > index 4db700fc36ba..4a9e41d01628 100644
> > --- a/Documentation/devicetree/bindings/pci/mediatek-pcie-gen3.yaml
> > +++ b/Documentation/devicetree/bindings/pci/mediatek-pcie-gen3.yaml
> > @@ -115,6 +115,10 @@ properties:
> >    power-domains:
> >      maxItems: 1
> >
> > +  memory-region:
> > +    maxItems: 1
> > +    description: phandle to restricted DMA buffer
>
> I guess this is similar to
> https://lore.kernel.org/linux-pci/20250716053950.199079-1-huaqian.li@siemens.com/
> and uses
> https://github.com/devicetree-org/dt-schema/blob/main/dtschema/schemas/reserved-memory/shared-dma-pool.yaml?

Correct.

> Looks like those keystone changes were never merged; I don't know what
> happened to them. But it will be good if everybody does it the same
> way.

Not sure what you mean. "dt-bindings: PCI: ti,am65: Extend for use with PVU"
was merged as commit 57a48a2619c5.

Maybe you are referring to the last patch in that series that adds a DT
overlay?

> I wish there were a simple way to grep for this restricted DMA
> concept. Maybe there is and I just haven't found it :)

Probably just grepping for "restricted" in the PCI bindings. :|

ChenYu

> >    mediatek,pbus-csr:
> >      $ref: /schemas/types.yaml#/definitions/phandle-array
> >      items:
> > --
> > 2.54.0.563.g4f69b47b94-goog
> >
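
For readers following the thread, here is a rough devicetree sketch of what the
proposed property might look like in use. It is not taken from the patch or
from any board file. It assumes the "restricted-dma-pool" compatible from the
shared-dma-pool.yaml schema linked above; the label names, the &pcie
reference, the address and the size are all made up for illustration.

```dts
/* Illustrative fragment only: labels, address and size are hypothetical. */
/ {
	reserved-memory {
		#address-cells = <2>;
		#size-cells = <2>;
		ranges;

		/* Carveout that the firmware-configured MPU leaves open to PCIe DMA. */
		pcie_restricted_dma: restricted-dma-pool@c0000000 {
			compatible = "restricted-dma-pool";
			reg = <0x0 0xc0000000 0x0 0x4000000>;
		};
	};
};

/* Hypothetical label for the mediatek-pcie-gen3 controller node. */
&pcie {
	/* A single item, matching "maxItems: 1" in the proposed binding. */
	memory-region = <&pcie_restricted_dma>;
};
```

Whether this or dma-ranges is the right description for the hardware is the
open question in this thread; the fragment only shows the memory-region form.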