[PATCH RFC v6 00/26] nova-core: Memory management infrastructure (v6)

Joel Fernandes posted 26 patches 2 weeks, 3 days ago
Documentation/gpu/drm-mm.rst                  |   10 +-
Documentation/gpu/nova/core/pramin.rst        |  125 ++
Documentation/gpu/nova/index.rst              |    1 +
MAINTAINERS                                   |    7 +
drivers/gpu/Kconfig                           |   13 +
drivers/gpu/Makefile                          |    2 +
drivers/gpu/buddy.c                           | 1310 +++++++++++++++++
drivers/gpu/drm/Kconfig                       |    1 +
drivers/gpu/drm/Kconfig.debug                 |    4 +-
drivers/gpu/drm/amd/amdgpu/Kconfig            |    1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c       |    2 +-
.../gpu/drm/amd/amdgpu/amdgpu_res_cursor.h    |   12 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c  |   80 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h  |   20 +-
drivers/gpu/drm/drm_buddy.c                   | 1284 +---------------
drivers/gpu/drm/i915/Kconfig                  |    1 +
drivers/gpu/drm/i915/i915_scatterlist.c       |   10 +-
drivers/gpu/drm/i915/i915_ttm_buddy_manager.c |   55 +-
drivers/gpu/drm/i915/i915_ttm_buddy_manager.h |    6 +-
.../drm/i915/selftests/intel_memory_region.c  |   20 +-
drivers/gpu/drm/tests/Makefile                |    1 -
.../gpu/drm/ttm/tests/ttm_bo_validate_test.c  |    5 +-
drivers/gpu/drm/ttm/tests/ttm_mock_manager.c  |   18 +-
drivers/gpu/drm/ttm/tests/ttm_mock_manager.h  |    4 +-
drivers/gpu/drm/xe/Kconfig                    |    1 +
drivers/gpu/drm/xe/xe_res_cursor.h            |   34 +-
drivers/gpu/drm/xe/xe_svm.c                   |   12 +-
drivers/gpu/drm/xe/xe_ttm_vram_mgr.c          |   73 +-
drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h    |    4 +-
drivers/gpu/nova-core/Kconfig                 |   22 +
drivers/gpu/nova-core/driver.rs               |    9 +-
drivers/gpu/nova-core/fb.rs                   |   23 +-
drivers/gpu/nova-core/gpu.rs                  |  140 +-
drivers/gpu/nova-core/gsp/boot.rs             |   22 +-
drivers/gpu/nova-core/gsp/commands.rs         |   18 +-
drivers/gpu/nova-core/gsp/fw/commands.rs      |   38 +
drivers/gpu/nova-core/mm/bar_user.rs          |  336 +++++
drivers/gpu/nova-core/mm/mod.rs               |  209 +++
drivers/gpu/nova-core/mm/pagetable/mod.rs     |  377 +++++
drivers/gpu/nova-core/mm/pagetable/ver2.rs    |  184 +++
drivers/gpu/nova-core/mm/pagetable/ver3.rs    |  286 ++++
drivers/gpu/nova-core/mm/pagetable/walk.rs    |  285 ++++
drivers/gpu/nova-core/mm/pramin.rs            |  404 +++++
drivers/gpu/nova-core/mm/tlb.rs               |   79 +
drivers/gpu/nova-core/mm/vmm.rs               |  247 ++++
drivers/gpu/nova-core/nova_core.rs            |    1 +
drivers/gpu/nova-core/regs.rs                 |   38 +
drivers/gpu/tests/Makefile                    |    3 +
.../gpu_buddy_test.c}                         |  390 ++---
drivers/gpu/tests/gpu_random.c                |   48 +
drivers/gpu/tests/gpu_random.h                |   28 +
drivers/video/Kconfig                         |    2 +
include/drm/drm_buddy.h                       |  163 +-
include/linux/gpu_buddy.h                     |  177 +++
rust/bindings/bindings_helper.h               |   11 +
rust/helpers/gpu.c                            |   23 +
rust/helpers/helpers.c                        |    2 +
rust/helpers/list.c                           |   12 +
rust/kernel/clist.rs                          |  357 +++++
rust/kernel/gpu/buddy.rs                      |  538 +++++++
rust/kernel/gpu/mod.rs                        |    5 +
rust/kernel/lib.rs                            |    3 +
62 files changed, 5788 insertions(+), 1808 deletions(-)
create mode 100644 Documentation/gpu/nova/core/pramin.rst
create mode 100644 drivers/gpu/Kconfig
create mode 100644 drivers/gpu/buddy.c
create mode 100644 drivers/gpu/nova-core/mm/bar_user.rs
create mode 100644 drivers/gpu/nova-core/mm/mod.rs
create mode 100644 drivers/gpu/nova-core/mm/pagetable/mod.rs
create mode 100644 drivers/gpu/nova-core/mm/pagetable/ver2.rs
create mode 100644 drivers/gpu/nova-core/mm/pagetable/ver3.rs
create mode 100644 drivers/gpu/nova-core/mm/pagetable/walk.rs
create mode 100644 drivers/gpu/nova-core/mm/pramin.rs
create mode 100644 drivers/gpu/nova-core/mm/tlb.rs
create mode 100644 drivers/gpu/nova-core/mm/vmm.rs
create mode 100644 drivers/gpu/tests/Makefile
rename drivers/gpu/{drm/tests/drm_buddy_test.c => tests/gpu_buddy_test.c} (68%)
create mode 100644 drivers/gpu/tests/gpu_random.c
create mode 100644 drivers/gpu/tests/gpu_random.h
create mode 100644 include/linux/gpu_buddy.h
create mode 100644 rust/helpers/gpu.c
create mode 100644 rust/helpers/list.c
create mode 100644 rust/kernel/clist.rs
create mode 100644 rust/kernel/gpu/buddy.rs
create mode 100644 rust/kernel/gpu/mod.rs
[PATCH RFC v6 00/26] nova-core: Memory management infrastructure (v6)
Posted by Joel Fernandes 2 weeks, 3 days ago
This series is rebased on drm-rust-kernel/drm-rust-next and provides memory
management infrastructure for the nova-core GPU driver. It combines several
previous series and provides a foundation for nova GPU memory management
including page tables, virtual memory management, and BAR mapping. All these
are critical nova-core features.

The series includes:
- A Rust module (CList) to interface with C circular linked lists, required
  for iterating over buddy allocator blocks.
- Movement of the DRM buddy allocator up to drivers/gpu/ level, renamed to GPU buddy.
- Rust bindings for the GPU buddy allocator.
- PRAMIN aperture support for direct VRAM access.
- Page table types for MMU v2 and v3 formats.
- Virtual Memory Manager (VMM) for GPU virtual address space management.
- BAR1 user interface for mapping and accessing GPU memory via virtual memory.
- Selftests for PRAMIN and BAR1 user interface (disabled by default).
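
For readers unfamiliar with the list layout the CList item refers to, here is a
minimal sketch of how a C-style circular list with a sentinel head is walked.
It is not the code from this series: the List and ListHead types, the
index-based links, and the push_back()/iter() names are all assumptions made
for illustration, and it uses plain std Rust rather than kernel APIs.

```rust
/// Illustrative model of a C-style circular doubly linked list.
/// Nodes live in a Vec and link to each other by index; index 0 is the
/// sentinel head. All names here are hypothetical, not from the series.
struct ListHead {
    next: usize,
    prev: usize,
}

struct List {
    nodes: Vec<ListHead>,
}

impl List {
    /// An empty list is a sentinel whose links point back at itself.
    fn new() -> Self {
        List { nodes: vec![ListHead { next: 0, prev: 0 }] }
    }

    /// Link a new node just before the sentinel (i.e. at the tail) and
    /// return its index.
    fn push_back(&mut self) -> usize {
        let new = self.nodes.len();
        let tail = self.nodes[0].prev;
        self.nodes.push(ListHead { next: 0, prev: tail });
        self.nodes[tail].next = new;
        self.nodes[0].prev = new;
        new
    }

    /// Start at the sentinel's `next` and stop once the walk arrives back
    /// at the sentinel, which is never yielded as an element.
    fn iter(&self) -> impl Iterator<Item = usize> + '_ {
        let mut cur = self.nodes[0].next;
        std::iter::from_fn(move || {
            if cur == 0 {
                return None;
            }
            let idx = cur;
            cur = self.nodes[idx].next;
            Some(idx)
        })
    }
}
```

Calling push_back() three times on a fresh list and collecting iter() yields
the three node indices in insertion order; an empty list yields nothing,
because the sentinel's next link points back at the sentinel.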

Changes from v5 to v6:
- Rebased on drm-rust-kernel/drm-rust-next
- Added page table types and page table walker infrastructure
- Added Virtual Memory Manager (VMM)
- Added BAR1 user interface
- Added TLB flush support
- Added GpuMm memory manager
- Extended to 26 patches from 6 (full mm infrastructure now included)

The git tree with all patches can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux.git (tag: nova-mm-v6-20260120)

Link to v5: https://lore.kernel.org/all/20251219203805.1246586-1-joelagnelf@nvidia.com/

Previous series that are combined:
- v4 (clist + buddy): https://lore.kernel.org/all/20251204215129.2357292-1-joelagnelf@nvidia.com/
- v3 (clist only): https://lore.kernel.org/all/20251129213056.4021375-1-joelagnelf@nvidia.com/
- v2 (clist only): https://lore.kernel.org/all/20251111171315.2196103-4-joelagnelf@nvidia.com/
- clist RFC (original with buddy): https://lore.kernel.org/all/20251030190613.1224287-1-joelagnelf@nvidia.com/
- DRM buddy move: https://lore.kernel.org/all/20251124234432.1988476-1-joelagnelf@nvidia.com/
- PRAMIN series: https://lore.kernel.org/all/20251020185539.49986-1-joelagnelf@nvidia.com/

Joel Fernandes (26):
  rust: clist: Add support to interface with C linked lists
  gpu: Move DRM buddy allocator one level up
  rust: gpu: Add GPU buddy allocator bindings
  nova-core: mm: Select GPU_BUDDY for VRAM allocation
  nova-core: mm: Add support to use PRAMIN windows to write to VRAM
  docs: gpu: nova-core: Document the PRAMIN aperture mechanism
  nova-core: Add BAR1 aperture type and size constant
  nova-core: gsp: Add BAR1 PDE base accessors
  nova-core: mm: Add common memory management types
  nova-core: mm: Add common types for all page table formats
  nova-core: mm: Add MMU v2 page table types
  nova-core: mm: Add MMU v3 page table types
  nova-core: mm: Add unified page table entry wrapper enums
  nova-core: mm: Add TLB flush support
  nova-core: mm: Add GpuMm centralized memory manager
  nova-core: mm: Add page table walker for MMU v2
  nova-core: mm: Add Virtual Memory Manager
  nova-core: mm: Add virtual address range tracking to VMM
  nova-core: mm: Add BAR1 user interface
  nova-core: gsp: Return GspStaticInfo and FbLayout from boot()
  nova-core: mm: Add memory management self-tests
  nova-core: mm: Add PRAMIN aperture self-tests
  nova-core: gsp: Extract usable FB region from GSP
  nova-core: fb: Add usable_vram field to FbLayout
  nova-core: mm: Use usable VRAM region for buddy allocator
  nova-core: mm: Add BarUser to struct Gpu and create at boot



base-commit: 6ea52b6d8f33ae627f4dcf43b12b6e713a8b9331
-- 
2.34.1
Re: [PATCH RFC v6 00/26] nova-core: Memory management infrastructure (v6)
Posted by Danilo Krummrich 1 week, 2 days ago
On Tue Jan 20, 2026 at 9:42 PM CET, Joel Fernandes wrote:
> This series is rebased on drm-rust-kernel/drm-rust-next and provides memory
> management infrastructure for the nova-core GPU driver. It combines several
> previous series and provides a foundation for nova GPU memory management
> including page tables, virtual memory management, and BAR mapping. All these
> are critical nova-core features.

Thanks for this work, I will go through the series soon. (Although it would also
be nice to have what I mention below addressed first.)

> The series includes:
> - A Rust module (CList) to interface with C circular linked lists, required
>   for iterating over buddy allocator blocks.
> - Movement of the DRM buddy allocator up to drivers/gpu/ level, renamed to GPU buddy.
> - Rust bindings for the GPU buddy allocator.
> - PRAMIN aperture support for direct VRAM access.
> - Page table types for MMU v2 and v3 formats.
> - Virtual Memory Manager (VMM) for GPU virtual address space management.
> - BAR1 user interface for mapping and accessing GPU memory via virtual memory.
> - Selftests for PRAMIN and BAR1 user interface (disabled by default).
>
> Changes from v5 to v6:
> - Rebased on drm-rust-kernel/drm-rust-next
> - Added page table types and page table walker infrastructure
> - Added Virtual Memory Manager (VMM)
> - Added BAR1 user interface
> - Added TLB flush support
> - Added GpuMm memory manager
> - Extended to 26 patches from 6 (full mm infrastructure now included)
>
> The git tree with all patches can be found at:
> git://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux.git (tag: nova-mm-v6-20260120)
>
> Link to v5: https://lore.kernel.org/all/20251219203805.1246586-1-joelagnelf@nvidia.com/
>
> Previous series that are combined:
> - v4 (clist + buddy): https://lore.kernel.org/all/20251204215129.2357292-1-joelagnelf@nvidia.com/
> - v3 (clist only): https://lore.kernel.org/all/20251129213056.4021375-1-joelagnelf@nvidia.com/
> - v2 (clist only): https://lore.kernel.org/all/20251111171315.2196103-4-joelagnelf@nvidia.com/
> - clist RFC (original with buddy): https://lore.kernel.org/all/20251030190613.1224287-1-joelagnelf@nvidia.com/
> - DRM buddy move: https://lore.kernel.org/all/20251124234432.1988476-1-joelagnelf@nvidia.com/
> - PRAMIN series: https://lore.kernel.org/all/20251020185539.49986-1-joelagnelf@nvidia.com/

I'm not overly happy with this version history. I understand that you are
building things on top of each other, but going back and forth with adding and
removing features from a series is confusing and makes it hard to keep track of
things.

(In the worst case it may even result in reviewers skipping over it leaving you
with no progress eventually.)

I.e. you started with a CList and DRM buddy RFC, then DRM buddy disappeared for a
few versions and came back eventually. Then, in the next version, the PRAMIN
stuff came back in, which also had a predecessor series already and now you
added lots of MM stuff on top of it.

The whole version history is about what features and patches were added and
removed to/from the series, rather than about what actually changed design wise
and code wise between the iterations (which is the important part for reviewers
and maintainers).

I also think it is confusing that a lot of the patches in this series have never
been posted before, yet they are labeled as v6 of this RFC.

Hence, please separate the features from each other in separate patch series,
with their own proper version history and changelog. In order to account for the
dependencies, you can just mention them in the cover letter and add a link to
the other related patch series, which should be sufficient for people interested
in the full picture.

I think the most clean approach would probably be a split with CList, DRM buddy
and Nova MM stuff.

And just to clarify, in the end I do not care too much about whether it's all in
a single series or split up, but going back and forth with combining things that
once have been separate and have a separate history doesn't work out well.
Re: [PATCH RFC v6 00/26] nova-core: Memory management infrastructure (v6)
Posted by Joel Fernandes 1 week, 2 days ago
On Jan 28, 2026, at 6:38 AM, Danilo Krummrich <dakr@kernel.org> wrote:
> On Tue Jan 20, 2026 at 9:42 PM CET, Joel Fernandes wrote:
>> This series is rebased on drm-rust-kernel/drm-rust-next and provides memory
>> management infrastructure for the nova-core GPU driver. It combines several
>> previous series and provides a foundation for nova GPU memory management
>> including page tables, virtual memory management, and BAR mapping. All these
>> are critical nova-core features.
>
> Thanks for this work, I will go through the series soon. (Although it would also
> be nice to have what I mention below addressed first.)

Thanks, I appreciate that.

> I'm not overly happy with this version history. I understand that you are
> building things on top of each other, but going back and forth with adding and
> removing features from a series is confusing and makes it hard to keep track of
> things.
>
> (In the worst case it may even result in reviewers skipping over it leaving you
> with no progress eventually.)
>
> [...]
>
> Hence, please separate the features from each other in separate patch series,
> with their own proper version history and changelog. In order to account for the
> dependencies, you can just mention them in the cover letter and add a link to
> the other related patch series, which should be sufficient for people interested
> in the full picture.
>
> I think the most clean approach would probably be a split with CList, DRM buddy
> and Nova MM stuff.
>
> And just to clarify, in the end I do not care too much about whether it's all in
> a single series or split up, but going back and forth with combining things that
> once have been separate and have a separate history doesn't work out well.

I understand the concern, and I appreciate you taking the time to explain. Let
me provide some context on how we ended up here, as it may help clarify the
situation.

1. This is a multi-month undertaking with many interdependencies. It is
   difficult to predict which patches will be needed, what the optimal order
   is, how to split them, which series should go first, or which pieces are
   missing. This is similar to the evolution of nova itself - complex
   interdependencies make it hard to predict what will be needed. Rather than
   waiting months for a perfect plan before posting anything, I chose to
   iterate publicly.

2. The decision to move GPU buddy out of DRM came later in the process [1].
   This significantly changed the scope, requiring a much larger patch to
   handle the buddy infrastructure that everything else depends on.

3. The decision to separate buddy from the CList series came from wanting to
   make progress on CList independently [2]. That effort alone took almost a
   month with several rewrites based on feedback from others.

4. There was some back and forth on whether to post code with users or code
   that could potentially be used. This influenced the decision to combine
   things into the same series to demonstrate working functionality.

5. The memory management code only became functional around v3. Page table
   walking turned out to be tricky, and I did not have a proper user at that
   time. Eventually I realized BAR1 is a strong use case for page table
   translation, so I added support for that.

Regarding splitting the series: that makes sense, I will split into CList, GPU
buddy, and Nova MM as you suggest. You make a fair point about the versioning
too - labeling new patches as v6 is confusing, even though most of the patches
are old. One question: what version numbers should each split series use?
CList was at v3 before being combined, and similar story for GPU buddy and
Nova MM. Should I continue from the last version number they were posted with,
or continue from v6?

[1] https://lore.kernel.org/all/20251124234432.1988476-1-joelagnelf@nvidia.com/
[2] https://lore.kernel.org/all/20251129213056.4021375-1-joelagnelf@nvidia.com/

--
Joel Fernandes
Re: [PATCH RFC v6 00/26] nova-core: Memory management infrastructure (v6)
Posted by Danilo Krummrich 1 week, 2 days ago
On Wed Jan 28, 2026 at 1:44 PM CET, Joel Fernandes wrote:
> I will split into CList, GPU buddy, and Nova MM as you suggest.

Thanks, together with a proper changelog this will help a lot.

> One question: what version numbers should each split series use? CList was at
> v3 before being combined, and similar story for GPU buddy and Nova MM. Should
> I continue from the last version number they were posted with, or continue
> from v6?

I'd say from the last version is probably best. Maybe you also want to move out
of the RFC stage for some of them.

Thanks,
Danilo