[PATCH v9 00/31] gpu: nova-core: firmware: Hopper/Blackwell support

John Hubbard posted 31 patches 6 days, 22 hours ago
drivers/gpu/nova-core/driver.rs          |  24 +-
drivers/gpu/nova-core/falcon.rs          |   1 +
drivers/gpu/nova-core/falcon/fsp.rs      | 220 ++++++++++
drivers/gpu/nova-core/falcon/hal.rs      |  21 +-
drivers/gpu/nova-core/fb.rs              |  26 +-
drivers/gpu/nova-core/fb/hal.rs          |  36 +-
drivers/gpu/nova-core/fb/hal/ga102.rs    |   2 +-
drivers/gpu/nova-core/fb/hal/gb100.rs    |  75 ++++
drivers/gpu/nova-core/fb/hal/gb202.rs    |  62 +++
drivers/gpu/nova-core/fb/hal/gh100.rs    |  38 ++
drivers/gpu/nova-core/firmware.rs        | 204 +++++++++
drivers/gpu/nova-core/firmware/booter.rs |  30 ++
drivers/gpu/nova-core/firmware/fsp.rs    |  44 ++
drivers/gpu/nova-core/firmware/gsp.rs    | 126 ++----
drivers/gpu/nova-core/fsp.rs             | 530 +++++++++++++++++++++++
drivers/gpu/nova-core/gpu.rs             |  87 +++-
drivers/gpu/nova-core/gpu/hal.rs         |  41 ++
drivers/gpu/nova-core/gsp/boot.rs        | 292 ++++++++++---
drivers/gpu/nova-core/gsp/commands.rs    |   8 +-
drivers/gpu/nova-core/gsp/fw.rs          |  85 +++-
drivers/gpu/nova-core/gsp/fw/commands.rs |  22 +-
drivers/gpu/nova-core/mctp.rs            | 119 +++++
drivers/gpu/nova-core/nova_core.rs       |   2 +
drivers/gpu/nova-core/regs.rs            |  96 ++++
rust/kernel/ptr.rs                       |  24 +
25 files changed, 1976 insertions(+), 239 deletions(-)
create mode 100644 drivers/gpu/nova-core/falcon/fsp.rs
create mode 100644 drivers/gpu/nova-core/fb/hal/gb100.rs
create mode 100644 drivers/gpu/nova-core/fb/hal/gb202.rs
create mode 100644 drivers/gpu/nova-core/fb/hal/gh100.rs
create mode 100644 drivers/gpu/nova-core/firmware/fsp.rs
create mode 100644 drivers/gpu/nova-core/fsp.rs
create mode 100644 drivers/gpu/nova-core/gpu/hal.rs
create mode 100644 drivers/gpu/nova-core/mctp.rs
[PATCH v9 00/31] gpu: nova-core: firmware: Hopper/Blackwell support
Posted by John Hubbard 6 days, 22 hours ago
This is based on today's drm-rust-next. A git branch is here:

    https://github.com/johnhubbard/linux/tree/nova-core-blackwell-v9

It's been re-tested on Turing and Blackwell:

    NovaCore 0000:e1:00.0: GPU name: NVIDIA GeForce GTX 1650
    NovaCore 0000:01:00.0: GPU name: NVIDIA RTX PRO 6000 Blackwell Max-Q
    Workstation Edition

Changes in v9:

* Rebased onto today's drm-rust-next.

* Split Architecture::Blackwell into BlackwellGB10x and BlackwellGB20x,
  after Gary Guo and Sashiko pointed out that GB10x and GB20x are
  distinct enough to warrant separate architecture variants. This
  surfaced several bugs where all Blackwell chips were incorrectly
  treated as a single group:
  * Fixed the FSP boot completion register address for GB10x. GB10x
    uses the same address as Hopper (0x000200bc), not the GB20x
    address (0x00ad00bc).
  * Made the FSP secure boot timeout architecture-dependent. GB20x
    now gets 5000ms while Hopper and GB10x keep 4000ms.
  * Removed chipset-level match arms that were working around the
    single-variant design in fb/hal.rs, firmware/gsp.rs, and regs.rs.

* Simplified find_gsp_sigs_section() to return &'static str instead of
  Option<&'static str>, since the Architecture enum is now exhaustive
  and every variant has a known signature section name.

* Moved dma_set_mask_and_coherent from probe() into Gpu::new(), with
  the unsafe block narrowed to just that call. Gpu::new() now takes
  pci::Device<device::Core> instead of device::Bound to support this.

* Dropped the local `chipset` variable in Gpu::new() and accessed
  spec.chipset() directly, since Spec is now Copy.

* Changed Spec::chipset() to take self instead of &self, since Spec is
  Copy.

* Removed the unnecessary Tu102/Gh100 consts in gpu/hal.rs and used the
  unit structs directly.

* Kept a hold on the Firmware object in FspFirmware instead of copying
  the FMC ELF into a KVec<u8>.

* Moved the dev_info formatting fix and the GFW_BOOT comment removal
  out of the Copy/Clone patch and into the patches that actually touch
  those lines.

* Added Reviewed-by tags from Gary Guo and Alice Ryhl.

Changes in v8:

* Added Clone/Copy derives to Spec and Revision. Removed the
  unnecessary pin_init_scope wrapping in Gpu::new() that the lack of
  Copy had forced. Added a Spec::chipset() accessor.

* Removed implementation-detail sentence from the
  Architecture::dma_mask() doccomment.

* Simplified the GPU HAL to two variants (Tu102, Gh100) instead of
  four. Renamed "Fsp" to "Gh100" to follow the HAL naming convention.
  Removed the spurious GA100 special case. Moved the GFW_BOOT wait into
  the HAL method itself instead of returning a bool.

* Increased the GFW_BOOT wait timeout from 4 seconds to 30 seconds,
  after Joel found that a different Blackwell SKU required extra time.

* Removed stray Cc lines from each patch.

* Fixed rustfmt issues in gsp/fw.rs and gsp/boot.rs reported by the
  kernel test robot against v7 patches 27 and 31.

Changes in v7:
* Rebased onto Alexandre Courbot's rust register!() series in
  drm-rust-next, including the related generic I/O accessor and
  IoCapable changes.

* Rebased onto drm-rust-next (v7.0-rc4 based).

* Dropped the v6 patches that are already in drm-rust-next: the
  aux-device fix, the pdev helper macro patch, and the one-item-per-line
  use cleanup.

* Reworked the GPU init pieces per review. DMA mask setup now stays in
  driver probe, with the mask width selected by GPU architecture, and
  the GFW boot policy now lives in a dedicated GPU HAL.

* Reworked firmware image parsing per review around a single ElfFormat
  trait with associated header types. Also added support for both ELF32
  and ELF64 images, with automatic format detection.

* Reworked the MCTP/NVDM protocol code to use bitfield! and typed
  accessors, removing the open-coded bit handling.

* Reworked the FSP messaging part of the series so that the message
  structures are introduced in the first patches that use them, instead
  of as a standalone dead-code-only patch. Also changed fmc_full to use
  KVec<u8> from the start.

* Split the WPR heap overflow handling out into a separate prep patch.
  That patch makes management_overhead() and wpr_heap_size() fallible,
  uses checked arithmetic, and leaves the larger WPR2 heap patch with
  only the Hopper and Blackwell sizing changes.

* Added a code comment documenting the Hopper and Blackwell PCI config
  mirror base change.

Changes in v6:

* Rebased onto drm-rust-next (v7.0-rc1 based).

* Dropped the first two patches from v5 (aux device fix and pdev
  macros), which have since been merged independently.

* const_align_up(): reworked per review from Gary Guo, Miguel Ojeda,
  and Danilo Krummrich: now returns Option<usize> instead of panicking,
  takes an Alignment argument instead of a const generic, and no longer
  needs the inline_const feature addition in scripts/Makefile.build.

* The rust/sizes and SZ_*_U64 patches from v5 are no longer included.
  I plan to post those as a separate series that depends on this one.

Changes in v5:

* Rebased onto linux.git master.

* Split MCTP protocol into its own module and file.

* Many Rust-based improvements: more use of types, especially. Also
  used Result and Option more.

* Lots of cleanup of comments and print output and error handling.

* Added const_align_up() to rust/ and used it in nova-core. This
  required enabling a Rust feature: inline_const, as recommended by
  Miguel Ojeda.

* Refactoring various things, such as Gpu::new() to own Spec creation,
  and several more such things.

* Fixed three Delta::ZERO busy-polls (patches 21, 24, 31) to use
  non-zero sleep intervals (after just realizing that it was a bad
  choice to have zero in there).

* Reduced GH100/GB100 HAL duplication. Made FSP_PKEY_SIZE/FSP_SIG_SIZE
  consistent across patches. Replaced fragile architecture checks with
  chipset.arch(). Renamed LIBOS_BLACKWELL.

* Narrowed the scope of some of the #![expect(dead_code)] cases,
  although that really only matters within the series, not once it is
  fully applied.

John Hubbard (31):
  gpu: nova-core: Hopper/Blackwell: basic GPU identification
  gpu: nova-core: factor .fwsignature* selection into a new
    find_gsp_sigs_section()
  gpu: nova-core: use GPU Architecture to simplify HAL selections
  gpu: nova-core: add Copy/Clone to Spec and Revision, add chipset()
    accessor
  gpu: nova-core: set DMA mask width based on GPU architecture
  gpu: nova-core: Hopper/Blackwell: skip GFW boot waiting
  gpu: nova-core: move firmware image parsing code to firmware.rs
  gpu: nova-core: factor out an elf_str() function
  gpu: nova-core: don't assume 64-bit firmware images
  gpu: nova-core: add support for 32-bit firmware images
  gpu: nova-core: add auto-detection of 32-bit, 64-bit firmware images
  gpu: nova-core: Hopper/Blackwell: add FMC firmware image, in support
    of FSP
  gpu: nova-core: Hopper/Blackwell: add FSP falcon engine stub
  gpu: nova-core: Hopper/Blackwell: add FSP falcon EMEM operations
  gpu: nova-core: Hopper/Blackwell: add FSP message infrastructure
  rust: ptr: add const_align_up()
  gpu: nova-core: Hopper/Blackwell: calculate reserved FB heap size
  gpu: nova-core: add MCTP/NVDM protocol types for firmware
    communication
  gpu: nova-core: Hopper/Blackwell: add FSP secure boot completion
    waiting
  gpu: nova-core: Hopper/Blackwell: add FMC signature extraction
  gpu: nova-core: Hopper/Blackwell: add FSP send/receive messaging
  gpu: nova-core: Hopper/Blackwell: add FspCotVersion type
  gpu: nova-core: Hopper/Blackwell: larger non-WPR heap
  gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot
  gpu: nova-core: Blackwell: use correct sysmem flush registers
  gpu: nova-core: make WPR heap sizing fallible
  gpu: nova-core: Hopper/Blackwell: larger WPR2 (GSP) heap
  gpu: nova-core: refactor SEC2 booter loading into
    BooterFirmware::run()
  gpu: nova-core: Hopper/Blackwell: add GSP lockdown release polling
  gpu: nova-core: Hopper/Blackwell: new location for PCI config mirror
  gpu: nova-core: Hopper/Blackwell: integrate FSP boot path into boot()

 drivers/gpu/nova-core/driver.rs          |  24 +-
 drivers/gpu/nova-core/falcon.rs          |   1 +
 drivers/gpu/nova-core/falcon/fsp.rs      | 220 ++++++++++
 drivers/gpu/nova-core/falcon/hal.rs      |  21 +-
 drivers/gpu/nova-core/fb.rs              |  26 +-
 drivers/gpu/nova-core/fb/hal.rs          |  36 +-
 drivers/gpu/nova-core/fb/hal/ga102.rs    |   2 +-
 drivers/gpu/nova-core/fb/hal/gb100.rs    |  75 ++++
 drivers/gpu/nova-core/fb/hal/gb202.rs    |  62 +++
 drivers/gpu/nova-core/fb/hal/gh100.rs    |  38 ++
 drivers/gpu/nova-core/firmware.rs        | 204 +++++++++
 drivers/gpu/nova-core/firmware/booter.rs |  30 ++
 drivers/gpu/nova-core/firmware/fsp.rs    |  44 ++
 drivers/gpu/nova-core/firmware/gsp.rs    | 126 ++----
 drivers/gpu/nova-core/fsp.rs             | 530 +++++++++++++++++++++++
 drivers/gpu/nova-core/gpu.rs             |  87 +++-
 drivers/gpu/nova-core/gpu/hal.rs         |  41 ++
 drivers/gpu/nova-core/gsp/boot.rs        | 292 ++++++++++---
 drivers/gpu/nova-core/gsp/commands.rs    |   8 +-
 drivers/gpu/nova-core/gsp/fw.rs          |  85 +++-
 drivers/gpu/nova-core/gsp/fw/commands.rs |  22 +-
 drivers/gpu/nova-core/mctp.rs            | 119 +++++
 drivers/gpu/nova-core/nova_core.rs       |   2 +
 drivers/gpu/nova-core/regs.rs            |  96 ++++
 rust/kernel/ptr.rs                       |  24 +
 25 files changed, 1976 insertions(+), 239 deletions(-)
 create mode 100644 drivers/gpu/nova-core/falcon/fsp.rs
 create mode 100644 drivers/gpu/nova-core/fb/hal/gb100.rs
 create mode 100644 drivers/gpu/nova-core/fb/hal/gb202.rs
 create mode 100644 drivers/gpu/nova-core/fb/hal/gh100.rs
 create mode 100644 drivers/gpu/nova-core/firmware/fsp.rs
 create mode 100644 drivers/gpu/nova-core/fsp.rs
 create mode 100644 drivers/gpu/nova-core/gpu/hal.rs
 create mode 100644 drivers/gpu/nova-core/mctp.rs


base-commit: dff8302ca1d0e773c90dbeeb05e759f995c95482
-- 
2.53.0
Re: [PATCH v9 00/31] gpu: nova-core: firmware: Hopper/Blackwell support
Posted by Alexandre Courbot 2 days, 19 hours ago
>   gpu: nova-core: move firmware image parsing code to firmware.rs
>   gpu: nova-core: factor out an elf_str() function

These two commits pushed to drm-rust-next, thanks!

The rest will need to wait until `drm-rust-next` reopens as -rc6 has
just been tagged.
Re: [PATCH v9 00/31] gpu: nova-core: firmware: Hopper/Blackwell support
Posted by John Hubbard 2 days, 1 hour ago
On 3/29/26 10:10 PM, Alexandre Courbot wrote:
>>   gpu: nova-core: move firmware image parsing code to firmware.rs
>>   gpu: nova-core: factor out an elf_str() function
> 
> These two commits pushed to drm-rust-next, thanks!
> 
> The rest will need to wait until `drm-rust-next` reopens as -rc6 has
> just been tagged.

Sounds good!

thanks,
-- 
John Hubbard