[v4] gpu: nova-core: firmware: Hopper/Blackwell support

[PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support
Posted by John Hubbard 18 hours ago
Hi,

This is based on the Feb 5, 2026 linux-next: commit 9845cf73f7db ("Add
linux-next specific files for 20260205"). That's new enough to have the
pdev.as_ref() changes (see below for details), but not so new as to
include the current merge window churn for Linux 7.0.

I've re-tested on Ampere (GA104) and Blackwell (GB202) RTX GPUs.

Data center GPUs remain as TODO items: GA100 needs some additional code,
Hopper/GH100 might work but is not yet tested, and I haven't even
thought about Blackwell data center GPUs.

So, even though many patches say Hopper/Blackwell, there may be some
test-and-fix work remaining there.

Changes in v4:

* Fixed the IOMMU page faults on address 0x0 that I was seeing on v3 and
  earlier, for the IOMMU-enabled case. These were due to the sysmem
  flush buffer being in a different location for Blackwell, so I've
  HAL-ified that aspect.

* Added a patch (0001) to pass pdev directly to dev_* logging macros.
  Then converted the remaining patches to also use pdev directly,
  instead of pdev.as_ref(). This is only possible in branches that have
  commit a38cd1fea989 ("rust: device: support `dev_printk` on all
  devices"), which in turn is why this v4 is based on a linux-next
  commit.

* Changed FmcSignatures fields from [u32; N] to [u8; N] arrays because
  the data is not treated as 32-bit integers. This eliminates the need
  for .as_bytes_mut() in the FMC signature extraction patch and allows
  using named constants like [u8; FSP_HASH_SIZE]. (From Timur Tabi's
  review.)

* Changed .unwrap_or(u64::MAX) to .expect("...") for alignment overflow
  in client_alloc_size() and management_overhead(). A panic is warranted
  here since the values are compile-time constants and overflow is
  impossible. (From Timur Tabi's review.)

* Added a patch at the end that I actually expect will get merged
  earlier, separately. But for now, it avoids nova-drm aux bus
  registration failure on multi-GPU systems, which in turn keeps the
  driver alive, which in turn avoids a (pre-existing) missing driver
  teardown feature, which in turn avoids IOMMU page faults at non-zero
  addresses. Whew. :)
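
To illustrate the .unwrap_or(u64::MAX) -> .expect("...") item above, here
is a minimal userspace sketch. It is not code from the series: the
constants, the checked_align_up() helper, and both function names are
hypothetical stand-ins, chosen only to show why a panic is reasonable
when every input is a compile-time constant.

```rust
// Illustrative sketch only: names and values are hypothetical,
// not taken from the nova-core patches.

/// Hypothetical alignment (1 MiB) and unaligned size, both constants.
const ALIGN: u64 = 1 << 20;
const RAW_SIZE: u64 = 48 * (1 << 20) + 4096;

/// Rounds `value` up to the next multiple of `align` (a power of two).
/// Returns `None` if the intermediate addition would overflow a u64.
fn checked_align_up(value: u64, align: u64) -> Option<u64> {
    debug_assert!(align.is_power_of_two());
    let mask = align - 1;
    value.checked_add(mask).map(|v| v & !mask)
}

/// Old style: a silent fallback turns the "impossible" case into a
/// bogus, very large size that callers would have to notice.
fn alloc_size_with_fallback() -> u64 {
    checked_align_up(RAW_SIZE, ALIGN).unwrap_or(u64::MAX)
}

/// New style: the inputs are constants, so an overflow could only be a
/// programming error, and it panics with a clear message.
fn alloc_size_with_expect() -> u64 {
    checked_align_up(RAW_SIZE, ALIGN)
        .expect("aligning a constant size cannot overflow")
}
```

With these example values, both functions return the same aligned size;
the difference only shows up if someone later changes a constant to
something that overflows.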

Changes in v3:

* Rebased onto linux-next (20260205), which includes several
  rust-for-linux updates that affected nova-core.

* Removed redundant .as_ref() from dev_*!() macro call sites, since the
  dev_printk!() macro now calls .as_ref() internally (Gary Guo's
  "remove redundant .as_ref() for dev_* print" series).

* Added a `use kernel::io::Io` import in regs.rs, needed after the
  upstream separation of generic I/O helpers from the MMIO
  implementation.
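
The .as_ref() item above can be pictured with a small standalone sketch.
This is not the kernel's dev_printk!() implementation: Device, PciDevice,
format_for() and the dev_msg!() macro are all hypothetical, and the
sketch builds a String instead of printing. It only shows the general
idea of a macro doing the AsRef conversion itself, so that call sites
can pass pdev directly.

```rust
// Illustrative sketch only: these types and the macro are hypothetical
// and are not the kernel crate's Device, pci::Device, or dev_printk!().

struct Device {
    name: &'static str,
}

struct PciDevice {
    dev: Device,
}

impl AsRef<Device> for Device {
    fn as_ref(&self) -> &Device {
        self
    }
}

impl AsRef<Device> for PciDevice {
    fn as_ref(&self) -> &Device {
        &self.dev
    }
}

/// Prefixes a message with the device name.
fn format_for(dev: &Device, msg: &str) -> String {
    format!("{}: {}", dev.name, msg)
}

/// Accepts anything that implements AsRef<Device> and performs the
/// conversion internally, so callers do not write `.as_ref()`.
macro_rules! dev_msg {
    ($dev:expr, $($arg:tt)*) => {{
        let dev: &Device = ::core::convert::AsRef::<Device>::as_ref(&$dev);
        format_for(dev, &format!($($arg)*))
    }};
}
```

A call site then reads dev_msg!(pdev, "boot step {}", 2) rather than
dev_msg!(pdev.as_ref(), ...).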

Changes in v2:

v2 is here:
    https://lore.kernel.org/20260131005604.454172-1-jhubbard@nvidia.com

* GA100 (an Ampere chip whose firmware boot steps are closer to Turing
  than to other Amperes) returns ENOTSUPP for now because it is *known*
  to not work yet.

* FSP: use the new Chipset::fsp_cot_version() method instead of a
  hardcoded constant. This fixes a known wrongness on GH100.

* Changed to a HAL approach to handle the slightly different non-WPR
  heap sizes, for Hopper vs. Blackwell.

* Return Option instead of Result from get_gsp_sigs_section() since
  the failure case is simply "not found".

* Return DmaMask directly from dma_mask() instead of returning a bit
  count.

* Change fmc_full from DmaObject to KVec<u8> since it's only used for
  CPU-side signature extraction and is never submitted to hardware
  (only fmc_image is). This eliminates the need for unsafe code and
  the associated SAFETY comment entirely.

* Use as_bytes_mut() instead of unsafe core::slice::from_raw_parts_mut()
  for copying FMC signature data (hash, public_key, signature arrays).

* Refactor wait_for_gsp_lockdown_release() to use early return with ?
  instead of a chained .inspect_err().map().and_then() pattern.

* Removed many dev_dbg! statements.

* Use IEC binary prefix "MiB" instead of "MB" for memory size output.
  Also improved display of small sizes (e.g., "24 KiB" instead of
  "0 MB") and fixed a typo ("suprising" -> "surprising").

* Reordered the "skip GFW boot waiting" commit to appear earlier in the
  series.

* Series has been reduced from 31 to 30 patches, because the "needs
  large reserved mem" patch was absorbed into the non-WPR heap size
  patch.
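
As a rough illustration of the "MiB instead of MB" item above, here is a
userspace sketch of size formatting with IEC binary prefixes. It is not
the code from the series: format_size() is a hypothetical helper, and
its rule (pick the largest unit that divides the value evenly) is just
one simple way to avoid printing a small size as "0 MB".

```rust
// Illustrative sketch only: format_size() is a hypothetical helper,
// not the formatting code used in nova-core.

/// Formats a byte count using the largest IEC unit that divides it
/// evenly, falling back to plain bytes so no precision is lost.
fn format_size(bytes: u64) -> String {
    const UNITS: [(u64, &str); 3] = [
        (1 << 30, "GiB"),
        (1 << 20, "MiB"),
        (1 << 10, "KiB"),
    ];

    for &(unit, name) in UNITS.iter() {
        // Require bytes >= unit so that 0 is not reported as "0 GiB".
        if bytes >= unit && bytes % unit == 0 {
            return format!("{} {}", bytes / unit, name);
        }
    }
    format!("{} B", bytes)
}
```

Under this rule a 24576-byte region is shown as "24 KiB", where dividing
by a fixed megabyte unit would have truncated it to 0.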

John Hubbard (33):
  gpu: nova-core: pass pdev directly to dev_* logging macros
  gpu: nova-core: print FB sizes, along with ranges
  gpu: nova-core: add FbRange.len() and use it in boot.rs
  gpu: nova-core: Hopper/Blackwell: basic GPU identification
  gpu: nova-core: factor .fwsignature* selection into a new
    get_gsp_sigs_section()
  gpu: nova-core: use GPU Architecture to simplify HAL selections
  gpu: nova-core: apply the one "use" item per line policy to
    commands.rs
  gpu: nova-core: set DMA mask width based on GPU architecture
  gpu: nova-core: Hopper/Blackwell: skip GFW boot waiting
  gpu: nova-core: move firmware image parsing code to firmware.rs
  gpu: nova-core: factor out a section_name_eq() function
  gpu: nova-core: don't assume 64-bit firmware images
  gpu: nova-core: add support for 32-bit firmware images
  gpu: nova-core: add auto-detection of 32-bit, 64-bit firmware images
  gpu: nova-core: Hopper/Blackwell: add FMC firmware image, in support
    of FSP
  gpu: nova-core: Hopper/Blackwell: add FSP falcon engine stub
  gpu: nova-core: Hopper/Blackwell: add FSP falcon EMEM operations
  gpu: nova-core: Hopper/Blackwell: add FSP message infrastructure
  gpu: nova-core: Hopper/Blackwell: calculate reserved FB heap size
  gpu: nova-core: Hopper/Blackwell: add FSP secure boot completion
    waiting
  gpu: nova-core: Hopper/Blackwell: add FSP message structures
  gpu: nova-core: Hopper/Blackwell: add FMC signature extraction
  gpu: nova-core: Hopper/Blackwell: add FSP send/receive messaging
  gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot
  gpu: nova-core: Hopper/Blackwell: larger non-WPR heap
  gpu: nova-core: Blackwell: use correct sysmem flush registers
  gpu: nova-core: Hopper/Blackwell: larger WPR2 (GSP) heap
  gpu: nova-core: refactor SEC2 booter loading into run_booter() helper
  gpu: nova-core: Hopper/Blackwell: add GSP lockdown release polling
  gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot path
  gpu: nova-core: Hopper/Blackwell: new location for PCI config mirror
  gpu: nova-core: clarify the GPU firmware boot steps
  gpu: nova-core: fix aux device registration for multi-GPU systems

 drivers/gpu/nova-core/driver.rs          |  48 +-
 drivers/gpu/nova-core/falcon.rs          |   1 +
 drivers/gpu/nova-core/falcon/fsp.rs      | 160 +++++++
 drivers/gpu/nova-core/falcon/hal.rs      |  20 +-
 drivers/gpu/nova-core/fb.rs              | 118 ++++-
 drivers/gpu/nova-core/fb/hal.rs          |  34 +-
 drivers/gpu/nova-core/fb/hal/ga102.rs    |   2 +-
 drivers/gpu/nova-core/fb/hal/gb100.rs    |  73 +++
 drivers/gpu/nova-core/fb/hal/gb202.rs    |  62 +++
 drivers/gpu/nova-core/fb/hal/gh100.rs    |  37 ++
 drivers/gpu/nova-core/firmware.rs        | 186 ++++++++
 drivers/gpu/nova-core/firmware/fsp.rs    |  47 ++
 drivers/gpu/nova-core/firmware/gsp.rs    | 140 ++----
 drivers/gpu/nova-core/fsp.rs             | 561 +++++++++++++++++++++++
 drivers/gpu/nova-core/gpu.rs             |  87 +++-
 drivers/gpu/nova-core/gsp/boot.rs        | 337 +++++++++++---
 drivers/gpu/nova-core/gsp/commands.rs    |   8 +-
 drivers/gpu/nova-core/gsp/fw.rs          |  63 ++-
 drivers/gpu/nova-core/gsp/fw/commands.rs |  32 +-
 drivers/gpu/nova-core/nova_core.rs       |   1 +
 drivers/gpu/nova-core/num.rs             |  10 +
 drivers/gpu/nova-core/regs.rs            |  95 ++++
 22 files changed, 1856 insertions(+), 266 deletions(-)
 create mode 100644 drivers/gpu/nova-core/falcon/fsp.rs
 create mode 100644 drivers/gpu/nova-core/fb/hal/gb100.rs
 create mode 100644 drivers/gpu/nova-core/fb/hal/gb202.rs
 create mode 100644 drivers/gpu/nova-core/fb/hal/gh100.rs
 create mode 100644 drivers/gpu/nova-core/firmware/fsp.rs
 create mode 100644 drivers/gpu/nova-core/fsp.rs


base-commit: 9845cf73f7db6094c0d8419d6adb848028f4a921
-- 
2.53.0