[PATCH v5 00/12] gpu: nova-core: process and prepare more firmwares to boot GSP

Alexandre Courbot posted 12 patches 3 weeks ago
There is a newer version of this series
Documentation/gpu/nova/core/todo.rst              |  17 -
drivers/gpu/nova-core/driver.rs                   |   9 +-
drivers/gpu/nova-core/falcon.rs                   |   6 +-
drivers/gpu/nova-core/falcon/hal.rs               |   2 +-
drivers/gpu/nova-core/fb.rs                       |  65 +++-
drivers/gpu/nova-core/firmware.rs                 | 107 ++++--
drivers/gpu/nova-core/firmware/booter.rs          | 375 ++++++++++++++++++++++
drivers/gpu/nova-core/firmware/gsp.rs             | 239 ++++++++++++++
drivers/gpu/nova-core/firmware/riscv.rs           |  89 +++++
drivers/gpu/nova-core/gpu.rs                      | 143 ++++++---
drivers/gpu/nova-core/gsp.rs                      |  11 +
drivers/gpu/nova-core/gsp/fw.rs                   | 101 ++++++
drivers/gpu/nova-core/gsp/fw/r570_144.rs          |  28 ++
drivers/gpu/nova-core/gsp/fw/r570_144/bindings.rs | 126 ++++++++
drivers/gpu/nova-core/nova_core.rs                |   1 +
drivers/gpu/nova-core/util.rs                     |  20 --
16 files changed, 1216 insertions(+), 123 deletions(-)
[PATCH v5 00/12] gpu: nova-core: process and prepare more firmwares to boot GSP
Posted by Alexandre Courbot 3 weeks ago
This series makes more progress on the GSP boot process for Ampere GPUs.

At the end of the previous series [1], we were left with a WPR memory
region created by the FRTS firmware, but still far from the GSP running.
This series brings us closer to that goal by preparing 2 new firmware
packages:

* The Booter firmware, which the driver loads and runs on the SEC2
  falcon;
* The GSP bootloader and firmware, with the bootloader loaded by Booter onto the GSP RISC-V
  core to verify and load the actual GSP firmware.

There firmwares are involved in a rather complex dance that is made
necessary by limitations related to the level of privilege required to
load code onto the GSP (doing so requires a Heavy Secured signed
firmware, which is the role fulfilled by Booter).

The first 6 patches perform some cleanup and preparatory work for the
remainder of the series. Notably, the GSP boot is moved to a new method
of `Gpu` to get ready for the additional steps that will be added to it,
and the `Gpu` object is now fully built in-place. We also simplify
chipset name generation a bit and move the code requesting a firmware
file into a dedicated function in prevision of the removal of the
`Firmware` structure.

Patch 7 parses the Booter firmware file, queries the hardware for the
right signature to use, and patch it into the firmware blob. This makes
Booter ready to load and run.

Patches 8 and 9 prepare the GSP firmware and its bootloader, and keep
them into a single structure as they are closely related.

Patches 10 and 11 switch to the 570.144 firmware and introduce the
layout for its bindings. The raw bindings are stored in a private
module, and abstracted by the parent module as needed. This allows
consumer modules to access a safer/nicer form of the bindings than the
raw one, and also makes it easier to switch to a different (and
potentially incompatible) firmware version in the future.

570.144 has been picked because it is the latest firmware currently in
linux-firmware, but as development progresses the plan is to switch to a
newer one that is designed with the constraint of upstream in mind. So
support for 570.144 will be dropped in the future. Support for multiple
firmware versions is not considered at the moment: there is no immediate
need for it as the driver is still unstable, and we can think about this
point once we approach stability (and have better visibility about the
shape of the firmware at that point).

The last patch introduces the first bindings and uses them to compute
more framebuffer layout information needed for booting the GSP. A
separate patch series will pick up from there to use this information
and finally run these firmware blobs.

The base of this series is today's drm-rust-next, with a couple more
dependencies required:

- The pin-init patch allowing references to previously initialized
  fields [2],
- For the last patch, the Alignment series [3],
- The following diff to make the aforementioned pin-init patch build:

--- a/rust/kernel/devres.rs
+++ b/rust/kernel/devres.rs
@@ -138,7 +138,6 @@ pub fn new<'a, E>(

         try_pin_init!(&this in Self {
             dev: dev.into(),
-            callback,
             // INVARIANT: `inner` is properly initialized.
             inner <- {
                 // SAFETY: `this` is a valid pointer to uninitialized memory.
@@ -160,6 +159,7 @@ pub fn new<'a, E>(
                     data <- Revocable::new(data),
                 }))
             },
+            callback,
         })
     }

A tree with all these dependencies and the patches of this series is
available at [4].

[1] https://lore.kernel.org/rust-for-linux/20250619-nova-frts-v6-0-ecf41ef99252@nvidia.com/
[2] https://lore.kernel.org/rust-for-linux/20250905140047.3325945-1-lossin@kernel.org/
[3] https://lore.kernel.org/rust-for-linux/20250908-num-v5-0-c0f2f681ea96@nvidia.com/
[4] https://github.com/Gnurou/linux/tree/b4/nova_firmware

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
Changes in v5:
- Perform construction of `Gpu` object in-place.
- Link to v4: https://lore.kernel.org/r/20250909-nova_firmware-v4-0-4dcb433d5fee@nvidia.com

Changes in v4:
- Rebase on top of latest Alignment series.
- Make use of pin-init references to initialized fields.
- Remove all uses of `unsafe` except for `FromBytes` and `AsBytes`
  implementations.
- Keep the GSP placeholder inside the `Gpu` struct.
- Move GSP firmware bindings under the `gsp` module.
- Get the firmware-specific information from the bindings instead of a
  HAL.
- Link to v3: https://lore.kernel.org/r/20250902-nova_firmware-v3-0-56854d9c5398@nvidia.com

Changes in v3:
- Move the GSP boot process out of the Gpu constructor.
- Get rid of the `Firmware` struct and discard loaded firmware blobs
  after the GSP is booted.
- Consolidate the GSP firmware, bootloader and signatures into a single
  type.
- Make firmware bindings completely opaque to the driver.
- Improve firmware abstractions related to framebuffer carveout.
- Improve comments and naming throughout the series. (thanks John!)
- Use alias for bindings module in `nvfw` to avoid repeated version
  numbers everywhere. (thanks John!)
- Fix inconsistency in naming of members of Booter header. (thanks
  Timur!)
- Link to v2: https://lore.kernel.org/r/20250826-nova_firmware-v2-0-93566252fe3a@nvidia.com

Changes in v2:
- Add some GSP bindings and use them to compute more FB layout info
  needed to boot GSP,
- Use PinInit in GspFirmware to avoid several heap allocations,
- Rename `bootloader` to `gsp_bootloader` in `Firmware` to avoid
  confusion with the future Turing falcon bootloader,
- Link to v1: https://lore.kernel.org/r/20250822-nova_firmware-v1-0-ff5633679460@nvidia.com

---
Alexandre Courbot (11):
      gpu: nova-core: require `Send` on `FalconEngine` and `FalconHal`
      gpu: nova-core: move GSP boot code to a dedicated method
      gpu: nova-core: initialize Gpu structure fully in-place
      gpu: nova-core: add Chipset::name() method
      gpu: nova-core: firmware: move firmware request code into a function
      gpu: nova-core: firmware: add support for common firmware header
      gpu: nova-core: firmware: process Booter and patch its signature
      gpu: nova-core: firmware: process and prepare the GSP firmware
      gpu: nova-core: firmware: process the GSP bootloader
      gpu: nova-core: firmware: use 570.144 firmware
      gpu: nova-core: compute layout of more framebuffer regions required for GSP

Alistair Popple (1):
      gpu: nova-core: Add base files for r570.144 firmware bindings

 Documentation/gpu/nova/core/todo.rst              |  17 -
 drivers/gpu/nova-core/driver.rs                   |   9 +-
 drivers/gpu/nova-core/falcon.rs                   |   6 +-
 drivers/gpu/nova-core/falcon/hal.rs               |   2 +-
 drivers/gpu/nova-core/fb.rs                       |  65 +++-
 drivers/gpu/nova-core/firmware.rs                 | 107 ++++--
 drivers/gpu/nova-core/firmware/booter.rs          | 375 ++++++++++++++++++++++
 drivers/gpu/nova-core/firmware/gsp.rs             | 239 ++++++++++++++
 drivers/gpu/nova-core/firmware/riscv.rs           |  89 +++++
 drivers/gpu/nova-core/gpu.rs                      | 143 ++++++---
 drivers/gpu/nova-core/gsp.rs                      |  11 +
 drivers/gpu/nova-core/gsp/fw.rs                   | 101 ++++++
 drivers/gpu/nova-core/gsp/fw/r570_144.rs          |  28 ++
 drivers/gpu/nova-core/gsp/fw/r570_144/bindings.rs | 126 ++++++++
 drivers/gpu/nova-core/nova_core.rs                |   1 +
 drivers/gpu/nova-core/util.rs                     |  20 --
 16 files changed, 1216 insertions(+), 123 deletions(-)
---
base-commit: e2580413a83680f679904ad2f2c1aa6969876469
change-id: 20250822-nova_firmware-e0ffb492ba35

Best regards,
-- 
Alexandre Courbot <acourbot@nvidia.com>
Re: [PATCH v5 00/12] gpu: nova-core: process and prepare more firmwares to boot GSP
Posted by Danilo Krummrich 3 weeks ago
On 9/11/25 1:04 PM, Alexandre Courbot wrote:
> This series makes more progress on the GSP boot process for Ampere GPUs.

With the changes mentioned on the individual patches, the series is:

Acked-by: Danilo Krummrich <dakr@kernel.org>
[PATCH v5 01/12] gpu: nova-core: require `Send` on `FalconEngine` and `FalconHal`
Posted by Alexandre Courbot 3 weeks ago
We want to store the GSP and SEC2 falcon instances inside the `Gpu`
structure, but doing so require these types to implement `Send` for
`pci::Driver` to remain implementable on `NovaCore`, which embeds `Gpu`.

All implementors of `FalconEngine` and `FalconHal` satisfy the
requirements of `Send`, and these traits also already required `Sync`,
so this a minor tweak.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
 drivers/gpu/nova-core/falcon.rs     | 2 +-
 drivers/gpu/nova-core/falcon/hal.rs | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/nova-core/falcon.rs b/drivers/gpu/nova-core/falcon.rs
index 2dbcdf26697beb7e52083675fc9ea62a6167fef8..b16207e7242fe1ac26b8552575b8b4e52f34cf6a 100644
--- a/drivers/gpu/nova-core/falcon.rs
+++ b/drivers/gpu/nova-core/falcon.rs
@@ -286,7 +286,7 @@ fn from(value: bool) -> Self {
 /// Each engine provides one base for `PFALCON` and `PFALCON2` registers. The `ID` constant is used
 /// to identify a given Falcon instance with register I/O methods.
 pub(crate) trait FalconEngine:
-    Sync + RegisterBase<PFalconBase> + RegisterBase<PFalcon2Base> + Sized
+    Send + Sync + RegisterBase<PFalconBase> + RegisterBase<PFalcon2Base> + Sized
 {
     /// Singleton of the engine, used to identify it with register I/O methods.
     const ID: Self;
diff --git a/drivers/gpu/nova-core/falcon/hal.rs b/drivers/gpu/nova-core/falcon/hal.rs
index b233bc365882f9add9b6eab33b8d462d7913df37..bba28845561795914e9a4dba277d72bbac0b37dd 100644
--- a/drivers/gpu/nova-core/falcon/hal.rs
+++ b/drivers/gpu/nova-core/falcon/hal.rs
@@ -13,7 +13,7 @@
 /// Implements chipset-specific low-level operations. The trait is generic against [`FalconEngine`]
 /// so its `BASE` parameter can be used in order to avoid runtime bound checks when accessing
 /// registers.
-pub(crate) trait FalconHal<E: FalconEngine>: Sync {
+pub(crate) trait FalconHal<E: FalconEngine>: Send + Sync {
     /// Activates the Falcon core if the engine is a risvc/falcon dual engine.
     fn select_core(&self, _falcon: &Falcon<E>, _bar: &Bar0) -> Result {
         Ok(())

-- 
2.51.0
[PATCH v5 02/12] gpu: nova-core: move GSP boot code to a dedicated method
Posted by Alexandre Courbot 3 weeks ago
Right now the GSP boot code is very incomplete and limited to running
FRTS, so having it in `Gpu::new` is not a big constraint.

However, this will change as we add more steps of the GSP boot process,
and not all GPU families follow the same procedure, so having these
steps in a dedicated method is the logical construct.

Relatedly, booting the GSP typically happens only once in the GPU reset
cycle. Most of the data created to complete this step (notably firmware
loaded from user-space) is needed only temporarily and can be discarded
once the GSP is booted; it then makes all the more sense to store these
as local variables of a dedicated method, instead of inside the `Gpu`
structure where they are kept as long as the GPU is bound, using dozens
of megabytes of host memory.

Thus, introduce a `start_gsp` static method on the `Gpu` struct, which
is called by `Gpu::new` to initialize the GSP and obtain its runtime
data. The GSP runtime data is currently an empty placeholder, but this
will change in a subsequent patch.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
 drivers/gpu/nova-core/gpu.rs | 43 +++++++++++++++++++++++++++++++++----------
 1 file changed, 33 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index a0878ecdc18bc9e9d975b9ab9c85dd7ab9c3d995..c8f876698b2e5da1d4250af377163a3f07a5ded0 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -172,14 +172,13 @@ pub(crate) struct Gpu {
     /// System memory page required for flushing all pending GPU-side memory writes done through
     /// PCIE into system memory, via sysmembar (A GPU-initiated HW memory-barrier operation).
     sysmem_flush: SysmemFlush,
+    /// GSP runtime data - temporarily a empty placeholder.
+    gsp: (),
 }
 
 impl Gpu {
     /// Helper function to load and run the FWSEC-FRTS firmware and confirm that it has properly
     /// created the WPR2 region.
-    ///
-    /// TODO: this needs to be moved into a larger type responsible for booting the whole GSP
-    /// (`GspBooter`?).
     fn run_fwsec_frts(
         dev: &device::Device<device::Bound>,
         falcon: &Falcon<Gsp>,
@@ -254,6 +253,33 @@ fn run_fwsec_frts(
         }
     }
 
+    /// Attempt to start the GSP.
+    ///
+    /// This is a GPU-dependent and complex procedure that involves loading firmware files from
+    /// user-space, patching them with signatures, and building firmware-specific intricate data
+    /// structures that the GSP will use at runtime.
+    ///
+    /// Upon return, the GSP is up and running, and its runtime object given as return value.
+    pub(crate) fn start_gsp(
+        pdev: &pci::Device<device::Bound>,
+        bar: &Bar0,
+        chipset: Chipset,
+        gsp_falcon: &Falcon<Gsp>,
+        _sec2_falcon: &Falcon<Sec2>,
+    ) -> Result<()> {
+        let dev = pdev.as_ref();
+
+        let bios = Vbios::new(dev, bar)?;
+
+        let fb_layout = FbLayout::new(chipset, bar)?;
+        dev_dbg!(dev, "{:#x?}\n", fb_layout);
+
+        Self::run_fwsec_frts(dev, gsp_falcon, bar, &bios, &fb_layout)?;
+
+        // Return an empty placeholder for now, to be replaced with the GSP runtime data.
+        Ok(())
+    }
+
     pub(crate) fn new(
         pdev: &pci::Device<device::Bound>,
         devres_bar: Arc<Devres<Bar0>>,
@@ -284,20 +310,17 @@ pub(crate) fn new(
         )?;
         gsp_falcon.clear_swgen0_intr(bar);
 
-        let _sec2_falcon = Falcon::<Sec2>::new(pdev.as_ref(), spec.chipset, bar, true)?;
+        let sec2_falcon = Falcon::<Sec2>::new(pdev.as_ref(), spec.chipset, bar, true)?;
 
-        let fb_layout = FbLayout::new(spec.chipset, bar)?;
-        dev_dbg!(pdev.as_ref(), "{:#x?}\n", fb_layout);
-
-        let bios = Vbios::new(pdev.as_ref(), bar)?;
-
-        Self::run_fwsec_frts(pdev.as_ref(), &gsp_falcon, bar, &bios, &fb_layout)?;
+        #[allow(clippy::let_unit_value)]
+        let gsp = Self::start_gsp(pdev, bar, spec.chipset, &gsp_falcon, &sec2_falcon)?;
 
         Ok(pin_init!(Self {
             spec,
             bar: devres_bar,
             fw,
             sysmem_flush,
+            gsp,
         }))
     }
 

-- 
2.51.0
Re: [PATCH v5 02/12] gpu: nova-core: move GSP boot code to a dedicated method
Posted by Danilo Krummrich 3 weeks ago
On 9/11/25 1:04 PM, Alexandre Courbot wrote:
> +    /// Attempt to start the GSP.
> +    ///
> +    /// This is a GPU-dependent and complex procedure that involves loading firmware files from
> +    /// user-space, patching them with signatures, and building firmware-specific intricate data
> +    /// structures that the GSP will use at runtime.
> +    ///
> +    /// Upon return, the GSP is up and running, and its runtime object given as return value.
> +    pub(crate) fn start_gsp(
> +        pdev: &pci::Device<device::Bound>,
> +        bar: &Bar0,
> +        chipset: Chipset,
> +        gsp_falcon: &Falcon<Gsp>,
> +        _sec2_falcon: &Falcon<Sec2>,
> +    ) -> Result<()> {> +        let dev = pdev.as_ref();
> +
> +        let bios = Vbios::new(dev, bar)?;
> +
> +        let fb_layout = FbLayout::new(chipset, bar)?;
> +        dev_dbg!(dev, "{:#x?}\n", fb_layout);
> +
> +        Self::run_fwsec_frts(dev, gsp_falcon, bar, &bios, &fb_layout)?;
> +
> +        // Return an empty placeholder for now, to be replaced with the GSP runtime data.
> +        Ok(())
> +    }

I'd rather create the Gsp structure already, move the code to Gsp::new() and
return an impl PinInit<Self, Error>. If you don't want to store any of the
object instances you create above yet, you can just stuff all the code into an
initializer code block, as you do in the next patch with
gfw::wait_gfw_boot_completion().
Re: [PATCH v5 02/12] gpu: nova-core: move GSP boot code to a dedicated method
Posted by Alexandre Courbot 3 weeks ago
On Thu Sep 11, 2025 at 8:22 PM JST, Danilo Krummrich wrote:
> On 9/11/25 1:04 PM, Alexandre Courbot wrote:
>> +    /// Attempt to start the GSP.
>> +    ///
>> +    /// This is a GPU-dependent and complex procedure that involves loading firmware files from
>> +    /// user-space, patching them with signatures, and building firmware-specific intricate data
>> +    /// structures that the GSP will use at runtime.
>> +    ///
>> +    /// Upon return, the GSP is up and running, and its runtime object given as return value.
>> +    pub(crate) fn start_gsp(
>> +        pdev: &pci::Device<device::Bound>,
>> +        bar: &Bar0,
>> +        chipset: Chipset,
>> +        gsp_falcon: &Falcon<Gsp>,
>> +        _sec2_falcon: &Falcon<Sec2>,
>> +    ) -> Result<()> {> +        let dev = pdev.as_ref();
>> +
>> +        let bios = Vbios::new(dev, bar)?;
>> +
>> +        let fb_layout = FbLayout::new(chipset, bar)?;
>> +        dev_dbg!(dev, "{:#x?}\n", fb_layout);
>> +
>> +        Self::run_fwsec_frts(dev, gsp_falcon, bar, &bios, &fb_layout)?;
>> +
>> +        // Return an empty placeholder for now, to be replaced with the GSP runtime data.
>> +        Ok(())
>> +    }
>
> I'd rather create the Gsp structure already, move the code to Gsp::new() and
> return an impl PinInit<Self, Error>. If you don't want to store any of the
> object instances you create above yet, you can just stuff all the code into an
> initializer code block, as you do in the next patch with
> gfw::wait_gfw_boot_completion().

I don't think that would work, or be any better even if it did. The full
GSP initialization is pretty complex and all we need to return is one
object created at the beginning that doesn't need to be pinned.
Moreover, the process is also dependent on the GPU family and completely
different on Hopper/Blackwell.

You can see the whole process on [1]. `libos` is the object that is
returned (although its name and type will change). All the rest it
loading, preparing and running firmware, and that is done on the GPU. I
think it would be very out of place in the GSP module.

It is also very step-by-step: run this firmware, wait for it to
complete, run another one, wait for a specific message from the GSP, run
the sequencer, etc. And most of this stuff is thrown away once the GSP
is running. That's where the limits of what we can do with `pin_init!`
are reached, and the GSP object doesn't need to be pinned anyway.

By keeping the initialization in the GPU, we can keep the GSP object
architecture-independent, and I think it makes sense from a design point
of view. That's not to say this code should be in `gpu.rs`, maybe we
want to move it to a GPU HAL, or if we really want this as part of the
GSP a `gsp/boot` module supporting all the different archs. But I'd
prefer to think about this when we start supporting several
architectures.

[1] https://github.com/Gnurou/linux/blob/gsp_init_rebase/drivers/gpu/nova-core/gpu.rs#L305
Re: [PATCH v5 02/12] gpu: nova-core: move GSP boot code to a dedicated method
Posted by Danilo Krummrich 3 weeks ago
On 9/11/25 2:17 PM, Alexandre Courbot wrote:
> On Thu Sep 11, 2025 at 8:22 PM JST, Danilo Krummrich wrote:
>> On 9/11/25 1:04 PM, Alexandre Courbot wrote:
>>> +    /// Attempt to start the GSP.
>>> +    ///
>>> +    /// This is a GPU-dependent and complex procedure that involves loading firmware files from
>>> +    /// user-space, patching them with signatures, and building firmware-specific intricate data
>>> +    /// structures that the GSP will use at runtime.
>>> +    ///
>>> +    /// Upon return, the GSP is up and running, and its runtime object given as return value.
>>> +    pub(crate) fn start_gsp(
>>> +        pdev: &pci::Device<device::Bound>,
>>> +        bar: &Bar0,
>>> +        chipset: Chipset,
>>> +        gsp_falcon: &Falcon<Gsp>,
>>> +        _sec2_falcon: &Falcon<Sec2>,
>>> +    ) -> Result<()> {> +        let dev = pdev.as_ref();
>>> +
>>> +        let bios = Vbios::new(dev, bar)?;
>>> +
>>> +        let fb_layout = FbLayout::new(chipset, bar)?;
>>> +        dev_dbg!(dev, "{:#x?}\n", fb_layout);
>>> +
>>> +        Self::run_fwsec_frts(dev, gsp_falcon, bar, &bios, &fb_layout)?;
>>> +
>>> +        // Return an empty placeholder for now, to be replaced with the GSP runtime data.
>>> +        Ok(())
>>> +    }
>>
>> I'd rather create the Gsp structure already, move the code to Gsp::new() and
>> return an impl PinInit<Self, Error>. If you don't want to store any of the
>> object instances you create above yet, you can just stuff all the code into an
>> initializer code block, as you do in the next patch with
>> gfw::wait_gfw_boot_completion().
> 
> I don't think that would work, or be any better even if it did. The full
> GSP initialization is pretty complex and all we need to return is one
> object created at the beginning that doesn't need to be pinned.
> Moreover, the process is also dependent on the GPU family and completely
> different on Hopper/Blackwell.

Why would it not work? There is no difference between the code above being
executed from an initializer block or directly in Gsp::new().
> You can see the whole process on [1]. `libos` is the object that is
> returned (although its name and type will change). All the rest it
> loading, preparing and running firmware, and that is done on the GPU. I
> think it would be very out of place in the GSP module.
> 
> It is also very step-by-step: run this firmware, wait for it to
> complete, run another one, wait for a specific message from the GSP, run
> the sequencer, etc. And most of this stuff is thrown away once the GSP
> is running. That's where the limits of what we can do with `pin_init!`
> are reached, and the GSP object doesn't need to be pinned anyway.

I don't see that, in the code you linked you have a bunch of calls that don't
return anything that needs to survive, this can be in an initializer block.

And then you have

let mut libos = gsp::GspMemObjects::new(pdev, bar)?;

which only needs the device reference and the bar reference.

So you can easily write this as:

try_pin_init!(Self {
   _: {
      // all the throw-away stuff from above
   },
   libos <- gsp::GspMemObjects::new(pdev, bar),
   _: {
      libos.do_some_stuff_mutable()?;
   }
})
> By keeping the initialization in the GPU, we can keep the GSP object
> architecture-independent, and I think it makes sense from a design point
> of view. That's not to say this code should be in `gpu.rs`, maybe we
> want to move it to a GPU HAL, or if we really want this as part of the
> GSP a `gsp/boot` module supporting all the different archs. But I'd
> prefer to think about this when we start supporting several
> architectures.

Didn't we talk about a struct Gsp that will eventually be returned by
Self::start_gsp(), or did I make this up in my head?

The way I think about this is that we'll have a struct Gsp that represents the
entry point in the driver to mess with the GSP command queue.

But either way, this throws up two questions, if Self::start_gsp() return a
struct GspMemObjects instead (which is probably the same thing with a different
name), then:

Are we sure this won't need any locks? If it will need locking (which I expect)
then it needs pin-init.

If it never needs pinning why did you write it as

gsp <- Self::start_gsp(pdev, bar, spec.chipset, gsp_falcon, sec2_falcon)?,

in a patch 3?
> [1] https://github.com/Gnurou/linux/blob/gsp_init_rebase/drivers/gpu/nova-core/gpu.rs#L305
Re: [PATCH v5 02/12] gpu: nova-core: move GSP boot code to a dedicated method
Posted by Alexandre Courbot 3 weeks ago
On Thu Sep 11, 2025 at 9:46 PM JST, Danilo Krummrich wrote:
> On 9/11/25 2:17 PM, Alexandre Courbot wrote:
>> On Thu Sep 11, 2025 at 8:22 PM JST, Danilo Krummrich wrote:
>>> On 9/11/25 1:04 PM, Alexandre Courbot wrote:
>>>> +    /// Attempt to start the GSP.
>>>> +    ///
>>>> +    /// This is a GPU-dependent and complex procedure that involves loading firmware files from
>>>> +    /// user-space, patching them with signatures, and building firmware-specific intricate data
>>>> +    /// structures that the GSP will use at runtime.
>>>> +    ///
>>>> +    /// Upon return, the GSP is up and running, and its runtime object given as return value.
>>>> +    pub(crate) fn start_gsp(
>>>> +        pdev: &pci::Device<device::Bound>,
>>>> +        bar: &Bar0,
>>>> +        chipset: Chipset,
>>>> +        gsp_falcon: &Falcon<Gsp>,
>>>> +        _sec2_falcon: &Falcon<Sec2>,
>>>> +    ) -> Result<()> {> +        let dev = pdev.as_ref();
>>>> +
>>>> +        let bios = Vbios::new(dev, bar)?;
>>>> +
>>>> +        let fb_layout = FbLayout::new(chipset, bar)?;
>>>> +        dev_dbg!(dev, "{:#x?}\n", fb_layout);
>>>> +
>>>> +        Self::run_fwsec_frts(dev, gsp_falcon, bar, &bios, &fb_layout)?;
>>>> +
>>>> +        // Return an empty placeholder for now, to be replaced with the GSP runtime data.
>>>> +        Ok(())
>>>> +    }
>>>
>>> I'd rather create the Gsp structure already, move the code to Gsp::new() and
>>> return an impl PinInit<Self, Error>. If you don't want to store any of the
>>> object instances you create above yet, you can just stuff all the code into an
>>> initializer code block, as you do in the next patch with
>>> gfw::wait_gfw_boot_completion().
>> 
>> I don't think that would work, or be any better even if it did. The full
>> GSP initialization is pretty complex and all we need to return is one
>> object created at the beginning that doesn't need to be pinned.
>> Moreover, the process is also dependent on the GPU family and completely
>> different on Hopper/Blackwell.
>
> Why would it not work? There is no difference between the code above being
> executed from an initializer block or directly in Gsp::new().

Yeah, that's pretty much my point. :) Why run it in an initializer if
the result doesn't need to be initialized in-place anyway?

>> You can see the whole process on [1]. `libos` is the object that is
>> returned (although its name and type will change). All the rest it
>> loading, preparing and running firmware, and that is done on the GPU. I
>> think it would be very out of place in the GSP module.
>> 
>> It is also very step-by-step: run this firmware, wait for it to
>> complete, run another one, wait for a specific message from the GSP, run
>> the sequencer, etc. And most of this stuff is thrown away once the GSP
>> is running. That's where the limits of what we can do with `pin_init!`
>> are reached, and the GSP object doesn't need to be pinned anyway.
>
> I don't see that, in the code you linked you have a bunch of calls that don't
> return anything that needs to survive, this can be in an initializer block.
>
> And then you have
>
> let mut libos = gsp::GspMemObjects::new(pdev, bar)?;
>
> which only needs the device reference and the bar reference.
>
> So you can easily write this as:
>
> try_pin_init!(Self {
>    _: {
>       // all the throw-away stuff from above
>    },
>    libos <- gsp::GspMemObjects::new(pdev, bar),
>    _: {
>       libos.do_some_stuff_mutable()?;
>    }
> })

Can the second initializer block access variables created in the first?
I suspect we can also initialize `libos` first, and move everything in a
block, but then my question would be why do we need to jump through that
hoop.

>> By keeping the initialization in the GPU, we can keep the GSP object
>> architecture-independent, and I think it makes sense from a design point
>> of view. That's not to say this code should be in `gpu.rs`, maybe we
>> want to move it to a GPU HAL, or if we really want this as part of the
>> GSP a `gsp/boot` module supporting all the different archs. But I'd
>> prefer to think about this when we start supporting several
>> architectures.
>
> Didn't we talk about a struct Gsp that will eventually be returned by
> Self::start_gsp(), or did I make this up in my head?
>
> The way I think about this is that we'll have a struct Gsp that represents the
> entry point in the driver to mess with the GSP command queue.
>
> But either way, this throws up two questions, if Self::start_gsp() return a
> struct GspMemObjects instead (which is probably the same thing with a different
> name), then:
>
> Are we sure this won't need any locks? If it will need locking (which I expect)
> then it needs pin-init.

Sorry, I have been imprecise: I should I said: "it can be moved" rather
than "it doesn't need to be pinned". In that case I don't think
`Gsp::new` needs to return an `impl PinInit`, right?

>
> If it never needs pinning why did you write it as
>
> gsp <- Self::start_gsp(pdev, bar, spec.chipset, gsp_falcon, sec2_falcon)?,
>
> in a patch 3?
>> [1] https://github.com/Gnurou/linux/blob/gsp_init_rebase/drivers/gpu/nova-core/gpu.rs#L305

Ah, I blindly copied that part from your initial suggestion [1] and
forgot to double check that part. We can use `:` here for `gsp`, as the
returned value of `start_gsp` can be moved without any issue. So if we
put it behind a lock at the `Gpu` level, the current pattern should not
be a problem as it can be moved where needed by the `Gpu` initializer.

Now I don't have a precise idea of how we are going to do locking, and
you seem to have given it more thought than I have, so please let me
know if I am still missing something.

[1] https://lore.kernel.org/rust-for-linux/DCOCL398HXDH.3QH9U6UGGIUP1@kernel.org/
Re: [PATCH v5 02/12] gpu: nova-core: move GSP boot code to a dedicated method
Posted by Joel Fernandes 2 weeks, 5 days ago
On Thu, Sep 11, 2025 at 10:26:08PM +0900, Alexandre Courbot wrote:
> On Thu Sep 11, 2025 at 9:46 PM JST, Danilo Krummrich wrote:
[..] 
> >> By keeping the initialization in the GPU, we can keep the GSP object
> >> architecture-independent, and I think it makes sense from a design point
> >> of view. That's not to say this code should be in `gpu.rs`, maybe we
> >> want to move it to a GPU HAL, or if we really want this as part of the
> >> GSP a `gsp/boot` module supporting all the different archs. But I'd
> >> prefer to think about this when we start supporting several
> >> architectures.
> >
> > Didn't we talk about a struct Gsp that will eventually be returned by
> > Self::start_gsp(), or did I make this up in my head?
> >
> > The way I think about this is that we'll have a struct Gsp that represents the
> > entry point in the driver to mess with the GSP command queue.
> >
> > But either way, this throws up two questions, if Self::start_gsp() return a
> > struct GspMemObjects instead (which is probably the same thing with a different
> > name), then:
> >
> > Are we sure this won't need any locks? If it will need locking (which I expect)
> > then it needs pin-init.
> 
> Sorry, I have been imprecise: I should I said: "it can be moved" rather
> than "it doesn't need to be pinned". In that case I don't think
> `Gsp::new` needs to return an `impl PinInit`, right?

If you don't mind clarifying for me, what is the difference between "it
doesn't need to be pinned" and "it can be moved"? AFAICS, they mean the same
thing. If you don't want move semantics on something, the only way to achieve
that is pinning no?. If it can be moved, and it contains locks, then that will
break unless pinned AFAICS. So if need locking in Gsp, which I think we'll
need (to support sychrnoized command queue accesses), then I think pinning is
unavoidable.

On the other hand, if just the firmware loading part is kept separate,
then perhaps that part can remain unpinned?

Any chance we can initialize the locks later? We don't need locking until
after the boot process is completed, and if there's a way we can dynamically
"pin", where we hypothetically pin after the boot process completed, that
might also work. Though I am not sure if that's something possible in
Rust/rust4linux or if it makes sense.

Other thoughts?

thanks,

 - Joel
Re: [PATCH v5 02/12] gpu: nova-core: move GSP boot code to a dedicated method
Posted by Danilo Krummrich 2 weeks, 5 days ago
On Sat Sep 13, 2025 at 3:02 AM CEST, Joel Fernandes wrote:
> Any chance we can initialize the locks later? We don't need locking until
> after the boot process is completed, and if there's a way we can dynamically
> "pin", where we hypothetically pin after the boot process completed, that
> might also work. Though I am not sure if that's something possible in
> Rust/rust4linux or if it makes sense.

We can't partially initialize structures and then rely on accessing initialized
data only. This is one of the sources for memory bugs that Rust tries to solve.

You can wrap fields into Option types and initialize them later, which would
defer pin-init calls for the price of having Option fields around.

However, we should never do such things. If there's the necessity to do
something like that, it indicates a design issue.

In this case, there's no problem, we can use pin-init without any issues right
away, and should do so.

pin-init is going to be an essential part of *every* Rust driver given that a
lot of the C infrastruture that we abstract requires pinned initialization, such
as locks and other synchronization primitives.
Re: [PATCH v5 02/12] gpu: nova-core: move GSP boot code to a dedicated method
Posted by Joel Fernandes 2 weeks, 5 days ago
On Sat, Sep 13, 2025 at 03:30:31PM +0200, Danilo Krummrich wrote:
> On Sat Sep 13, 2025 at 3:02 AM CEST, Joel Fernandes wrote:
> > Any chance we can initialize the locks later? We don't need locking until
> > after the boot process is completed, and if there's a way we can dynamically
> > "pin", where we hypothetically pin after the boot process completed, that
> > might also work. Though I am not sure if that's something possible in
> > Rust/rust4linux or if it makes sense.
> 
> We can't partially initialize structures and then rely on accessing initialized
> data only.

Yet, that is exactly what the pin initialization sequence block does? The
whole structure is not initialized yet you need access to already initialized
fields.

> This is one of the sources for memory bugs that Rust tries to solve.
> You can wrap fields into Option types and initialize them later, which would
> defer pin-init calls for the price of having Option fields around.

I am not denying the need for pinning. Also regarding Option, just thinking
out loud but if something is optional temporary, maybe needing a new type
like TempOption, and promote it to a non-option type later, I am not seeing
that as completely outside the world, if there is a legitimate usecase that
needs to be Option temporarily, but not later, what's wrong with that? It is
"Optional" for the timebeing, but not later.

> However, we should never do such things. If there's the necessity to do
> something like that, it indicates a design issue.
> 
> In this case, there's no problem, we can use pin-init without any issues right
> away, and should do so.
> 
> pin-init is going to be an essential part of *every* Rust driver given that a
> lot of the C infrastruture that we abstract requires pinned initialization, such
> as locks and other synchronization primitives.

To be honest, the pinning concept seems like an after thought for such a
fundamental thing that we need, requiring additional macros, and bandaids on
top of the language itself, to make it work for the kernel. I am not alone in
that opinion. This should be first-class in a (systems) language, built into
the language itself? I am talking about the whole pin initialization,
accessing fields dances, etc.

Also I am concerned that overusage of pinning defeats a lot of optimizations
that Rust may be able to perform, especially forcefully pinning stuff that
does not need all to be pinned (except to satisfy paranoia), thus generating
suboptimal code gen. Not only does it require redesign and concerns over
accesses to un-initialized fields, like we saw in the last 2-3 weeks, it also
forces people into that when maybe there is a chance that underlying
structures do not need to be pinned at all (for some usecases).

These are just my opinions.

thanks,

 - Joel
Re: [PATCH v5 02/12] gpu: nova-core: move GSP boot code to a dedicated method
Posted by Miguel Ojeda 2 weeks, 4 days ago
On Sat, Sep 13, 2025 at 7:14 PM Joel Fernandes <joelagnelf@nvidia.com> wrote:
>
> I am not alone in that opinion.

Hmm... I am not sure how to read this.

> This should be first-class in a (systems) language, built into
> the language itself?

I would suggest taking a look at our website and the links there (like
issue #2) -- what we are doing upstream Rust is documented.

(Danilo gave you a direct link, but I mention it this way because
there are a lot of things going on, and it is worth a look and perhaps
you may find something interesting you could help with).

> except to satisfy paranoia

Using unsafe code everywhere (or introducing unsoundness or UB for
convenience) would defeat much of the Rust for Linux exercise.

Cheers,
Miguel
Re: [PATCH v5 02/12] gpu: nova-core: move GSP boot code to a dedicated method
Posted by John Hubbard 2 weeks, 4 days ago
On 9/13/25 1:37 PM, Miguel Ojeda wrote:
> On Sat, Sep 13, 2025 at 7:14 PM Joel Fernandes <joelagnelf@nvidia.com> wrote:
>>
>> I am not alone in that opinion.
> 
> Hmm... I am not sure how to read this.
> 
>> This should be first-class in a (systems) language, built into
>> the language itself?

On this particular point, and *only* this point: some time around
mid-2025, I started wondering out loud, "shouldn't Rust have some
built-in understanding, in the language/compiler itself, of the
concept of pinned memory?"

Because, after doing a modest bit of Rust for Linux coding, I was
struck by "Rust is a systems programming langauge", vs. "systems
programming often involves DMA (which generally pins memory)".
And the other observation is that pin-init discussions are some
of the most advanced and exotic in Rust for Linux. These things
don't go together.

So it seemed like this is a lesson that Rust for Linux has learned,
that can be taken back to Rust itself. I recommended this as a
non-urgent Kangrejos topic.

> 
> I would suggest taking a look at our website and the links there (like
> issue #2) -- what we are doing upstream Rust is documented.

...and my question was asked before reading through issue #2. So your
and Danilo's responses seem to be saying that there is already some
understanding that this is an area that could be improved.

Good!

I believe "issue #2" refers to this, right?

    https://github.com/Rust-for-Linux/linux/issues/2

That's going to take some time to figure out if it interects
what I was requesting, but I'll have a go at it.

> 
> (Danilo gave you a direct link, but I mention it this way because
> there are a lot of things going on, and it is worth a look and perhaps
> you may find something interesting you could help with).
> 
>> except to satisfy paranoia
> 
> Using unsafe code everywhere (or introducing unsoundness or UB for
> convenience) would defeat much of the Rust for Linux exercise.
> 

Yes. It's only "paranoia" if the code is bug-free. So Rust itself
naturally will look "a little" paranoid, that's core to its mission. :)


thanks,
-- 
John Hubbard

Re: [PATCH v5 02/12] gpu: nova-core: move GSP boot code to a dedicated method
Posted by Joel Fernandes 2 weeks, 4 days ago
On Sat, Sep 13, 2025 at 02:29:54PM -0700, John Hubbard wrote:
[..]

> > 
> > I would suggest taking a look at our website and the links there (like
> > issue #2) -- what we are doing upstream Rust is documented.
> 
> ...and my question was asked before reading through issue #2. So your
> and Danilo's responses seem to be saying that there is already some
> understanding that this is an area that could be improved.
> 
> Good!
> 
> I believe "issue #2" refers to this, right?
> 
>    https://github.com/Rust-for-Linux/linux/issues/2
> 
> That's going to take some time to figure out if it interects
> what I was requesting, but I'll have a go at it.

Indeed, kudos to rust-for-linux community for working on missing Rust
features and on pinning itself.

> > 
> > (Danilo gave you a direct link, but I mention it this way because
> > there are a lot of things going on, and it is worth a look and perhaps
> > you may find something interesting you could help with).
> > 
> > > except to satisfy paranoia
> > 
> > Using unsafe code everywhere (or introducing unsoundness or UB for
> > convenience) would defeat much of the Rust for Linux exercise.
> > 
> 
> Yes. It's only "paranoia" if the code is bug-free. So Rust itself
> naturally will look "a little" paranoid, that's core to its mission. :)

This seems to be taken out-of-context, I said "paranoia" mainly because I am
not sure if excessive use of pinning may tend to cause other problems. The
"paranoia" is about over-usage of pinning. Personally, I don't prefer to pin
stuff in my code until I absolutely need to, or when I start having needs for
pinning, like using spinlocks. Maybe I am wrong, but the way I learnt Rust,
data movement is baked into it. I am not yet confident pinning will not
constraint Rust code gen -- but that could just be a part of my learning
journey that I have to convince myself it is Ok to do so in advance of
actually requiring it even if you simply hypothetically anticipate needing it
(as Danilo pointed out that in practice this is not an issue and I do tend to
agree with Miguel and Danilo because they are usually right :-D). I am
researching counter examples :-)

thanks,

 - Joel
Re: [PATCH v5 02/12] gpu: nova-core: move GSP boot code to a dedicated method
Posted by Alexandre Courbot 2 weeks, 4 days ago
On Sun Sep 14, 2025 at 7:06 AM JST, Joel Fernandes wrote:
> On Sat, Sep 13, 2025 at 02:29:54PM -0700, John Hubbard wrote:
> [..]
>
>> > 
>> > I would suggest taking a look at our website and the links there (like
>> > issue #2) -- what we are doing upstream Rust is documented.
>> 
>> ...and my question was asked before reading through issue #2. So your
>> and Danilo's responses seem to be saying that there is already some
>> understanding that this is an area that could be improved.
>> 
>> Good!
>> 
>> I believe "issue #2" refers to this, right?
>> 
>>    https://github.com/Rust-for-Linux/linux/issues/2
>> 
>> That's going to take some time to figure out if it interects
>> what I was requesting, but I'll have a go at it.
>
> Indeed, kudos to rust-for-linux community for working on missing Rust
> features and on pinning itself.
>
>> > 
>> > (Danilo gave you a direct link, but I mention it this way because
>> > there are a lot of things going on, and it is worth a look and perhaps
>> > you may find something interesting you could help with).
>> > 
>> > > except to satisfy paranoia
>> > 
>> > Using unsafe code everywhere (or introducing unsoundness or UB for
>> > convenience) would defeat much of the Rust for Linux exercise.
>> > 
>> 
>> Yes. It's only "paranoia" if the code is bug-free. So Rust itself
>> naturally will look "a little" paranoid, that's core to its mission. :)
>
> This seems to be taken out-of-context, I said "paranoia" mainly because I am
> not sure if excessive use of pinning may tend to cause other problems. The
> "paranoia" is about over-usage of pinning. Personally, I don't prefer to pin
> stuff in my code until I absolutely need to, or when I start having needs for
> pinning, like using spinlocks. Maybe I am wrong, but the way I learnt Rust,
> data movement is baked into it. I am not yet confident pinning will not
> constraint Rust code gen -- but that could just be a part of my learning
> journey that I have to convince myself it is Ok to do so in advance of
> actually requiring it even if you simply hypothetically anticipate needing it
> (as Danilo pointed out that in practice this is not an issue and I do tend to
> agree with Miguel and Danilo because they are usually right :-D). I am
> researching counter examples :-)

You can look at the definition for `Pin` in [1], but it is so short we
can paste it here:

    #[repr(transparent)]
    #[derive(Copy, Clone)]
    pub struct Pin<Ptr> {
        pointer: Ptr,
    }

There isn't much getting in the way of optimized code generation - its
purpose is simply to constraint the acquisition of mutable references to
prevent moving the pointee out.

I started this patchset a little bit skeptical about the need to pin so
many things, but after seeing the recent additions to `pin_init` and
rewriting the code as Danilo suggested, it starteds to click. The
supposed restrictions are in practice avoided by embracing the concept
fully, and in the end I got that feeling (familiar when writing Rust) of
being guided towards the right design - a bit like playing bowling with
gutter guards.

Yes, that means redesigning and rebasing our code, but that's also the
cost of learning a new language.

And yes, things can still be a little bit rough around the edges, but
there is awareness and action taken to address these issues, at the
compiler level when relevant. This makes me confident for the future.

[1] https://doc.rust-lang.org/src/core/pin.rs.html#1094
Re: [PATCH v5 02/12] gpu: nova-core: move GSP boot code to a dedicated method
Posted by Benno Lossin 2 weeks, 4 days ago
On Sun Sep 14, 2025 at 3:49 AM CEST, Alexandre Courbot wrote:
> On Sun Sep 14, 2025 at 7:06 AM JST, Joel Fernandes wrote:
>> On Sat, Sep 13, 2025 at 02:29:54PM -0700, John Hubbard wrote:
>>> Yes. It's only "paranoia" if the code is bug-free. So Rust itself
>>> naturally will look "a little" paranoid, that's core to its mission. :)
>>
>> This seems to be taken out-of-context, I said "paranoia" mainly because I am
>> not sure if excessive use of pinning may tend to cause other problems. The
>> "paranoia" is about over-usage of pinning. Personally, I don't prefer to pin
>> stuff in my code until I absolutely need to, or when I start having needs for
>> pinning, like using spinlocks. Maybe I am wrong, but the way I learnt Rust,
>> data movement is baked into it. I am not yet confident pinning will not
>> constraint Rust code gen -- but that could just be a part of my learning
>> journey that I have to convince myself it is Ok to do so in advance of
>> actually requiring it even if you simply hypothetically anticipate needing it
>> (as Danilo pointed out that in practice this is not an issue and I do tend to
>> agree with Miguel and Danilo because they are usually right :-D). I am
>> researching counter examples :-)
>
> You can look at the definition for `Pin` in [1], but it is so short we
> can paste it here:
>
>     #[repr(transparent)]
>     #[derive(Copy, Clone)]
>     pub struct Pin<Ptr> {
>         pointer: Ptr,
>     }
>
> There isn't much getting in the way of optimized code generation - its
> purpose is simply to constraint the acquisition of mutable references to
> prevent moving the pointee out.
>
> I started this patchset a little bit skeptical about the need to pin so
> many things, but after seeing the recent additions to `pin_init` and
> rewriting the code as Danilo suggested, it starteds to click. The
> supposed restrictions are in practice avoided by embracing the concept
> fully, and in the end I got that feeling (familiar when writing Rust) of
> being guided towards the right design - a bit like playing bowling with
> gutter guards.

That's a great way to put it -- I had a similar experience when writing
pin-init and thinking about it purely theoretically. Good to see that it
works out in practice (and continues to do so :).

> Yes, that means redesigning and rebasing our code, but that's also the
> cost of learning a new language.
>
> And yes, things can still be a little bit rough around the edges, but
> there is awareness and action taken to address these issues, at the
> compiler level when relevant. This makes me confident for the future.

If you have an issue that you cannot work around, or something that
looks off, let me know. Maybe that's something that pin-init can deal
better with or we can have another library that helps with it. After all
pin-init is specially tailored for the kernel to work :)

---
Cheers,
Benno
Re: [PATCH v5 02/12] gpu: nova-core: move GSP boot code to a dedicated method
Posted by Alexandre Courbot 2 weeks, 3 days ago
On Sun Sep 14, 2025 at 11:42 PM JST, Benno Lossin wrote:
> On Sun Sep 14, 2025 at 3:49 AM CEST, Alexandre Courbot wrote:
>> On Sun Sep 14, 2025 at 7:06 AM JST, Joel Fernandes wrote:
>>> On Sat, Sep 13, 2025 at 02:29:54PM -0700, John Hubbard wrote:
>>>> Yes. It's only "paranoia" if the code is bug-free. So Rust itself
>>>> naturally will look "a little" paranoid, that's core to its mission. :)
>>>
>>> This seems to be taken out-of-context, I said "paranoia" mainly because I am
>>> not sure if excessive use of pinning may tend to cause other problems. The
>>> "paranoia" is about over-usage of pinning. Personally, I don't prefer to pin
>>> stuff in my code until I absolutely need to, or when I start having needs for
>>> pinning, like using spinlocks. Maybe I am wrong, but the way I learnt Rust,
>>> data movement is baked into it. I am not yet confident pinning will not
>>> constraint Rust code gen -- but that could just be a part of my learning
>>> journey that I have to convince myself it is Ok to do so in advance of
>>> actually requiring it even if you simply hypothetically anticipate needing it
>>> (as Danilo pointed out that in practice this is not an issue and I do tend to
>>> agree with Miguel and Danilo because they are usually right :-D). I am
>>> researching counter examples :-)
>>
>> You can look at the definition for `Pin` in [1], but it is so short we
>> can paste it here:
>>
>>     #[repr(transparent)]
>>     #[derive(Copy, Clone)]
>>     pub struct Pin<Ptr> {
>>         pointer: Ptr,
>>     }
>>
>> There isn't much getting in the way of optimized code generation - its
>> purpose is simply to constraint the acquisition of mutable references to
>> prevent moving the pointee out.
>>
>> I started this patchset a little bit skeptical about the need to pin so
>> many things, but after seeing the recent additions to `pin_init` and
>> rewriting the code as Danilo suggested, it starteds to click. The
>> supposed restrictions are in practice avoided by embracing the concept
>> fully, and in the end I got that feeling (familiar when writing Rust) of
>> being guided towards the right design - a bit like playing bowling with
>> gutter guards.
>
> That's a great way to put it -- I had a similar experience when writing
> pin-init and thinking about it purely theoretically. Good to see that it
> works out in practice (and continues to do so :).
>
>> Yes, that means redesigning and rebasing our code, but that's also the
>> cost of learning a new language.
>>
>> And yes, things can still be a little bit rough around the edges, but
>> there is awareness and action taken to address these issues, at the
>> compiler level when relevant. This makes me confident for the future.
>
> If you have an issue that you cannot work around, or something that
> looks off, let me know. Maybe that's something that pin-init can deal
> better with or we can have another library that helps with it. After all
> pin-init is specially tailored for the kernel to work :)

I was thinking about the lack of access to already-initialized fields in
the initializer when writing this, which has been covered so thanks for
that. :)

One more thing I still don't know how to do without unsafe code is
accessing structurally-pinned members of a pinned object. I had expected
that projection methods would be generated for such members marked
`#[pin]`, but I haven't found anything yet. For instance, we need to
call mutable methods on a pinned member of a pinned object, and the only
way I have found to do this is to use unsafe code. Is there a better
way?
Re: [PATCH v5 02/12] gpu: nova-core: move GSP boot code to a dedicated method
Posted by Benno Lossin 2 weeks, 3 days ago
On Mon Sep 15, 2025 at 6:59 AM CEST, Alexandre Courbot wrote:
> On Sun Sep 14, 2025 at 11:42 PM JST, Benno Lossin wrote:
>> On Sun Sep 14, 2025 at 3:49 AM CEST, Alexandre Courbot wrote:
>>> On Sun Sep 14, 2025 at 7:06 AM JST, Joel Fernandes wrote:
>>>> On Sat, Sep 13, 2025 at 02:29:54PM -0700, John Hubbard wrote:
>>>>> Yes. It's only "paranoia" if the code is bug-free. So Rust itself
>>>>> naturally will look "a little" paranoid, that's core to its mission. :)
>>>>
>>>> This seems to be taken out-of-context, I said "paranoia" mainly because I am
>>>> not sure if excessive use of pinning may tend to cause other problems. The
>>>> "paranoia" is about over-usage of pinning. Personally, I don't prefer to pin
>>>> stuff in my code until I absolutely need to, or when I start having needs for
>>>> pinning, like using spinlocks. Maybe I am wrong, but the way I learnt Rust,
>>>> data movement is baked into it. I am not yet confident pinning will not
>>>> constraint Rust code gen -- but that could just be a part of my learning
>>>> journey that I have to convince myself it is Ok to do so in advance of
>>>> actually requiring it even if you simply hypothetically anticipate needing it
>>>> (as Danilo pointed out that in practice this is not an issue and I do tend to
>>>> agree with Miguel and Danilo because they are usually right :-D). I am
>>>> researching counter examples :-)
>>>
>>> You can look at the definition for `Pin` in [1], but it is so short we
>>> can paste it here:
>>>
>>>     #[repr(transparent)]
>>>     #[derive(Copy, Clone)]
>>>     pub struct Pin<Ptr> {
>>>         pointer: Ptr,
>>>     }
>>>
>>> There isn't much getting in the way of optimized code generation - its
>>> purpose is simply to constraint the acquisition of mutable references to
>>> prevent moving the pointee out.
>>>
>>> I started this patchset a little bit skeptical about the need to pin so
>>> many things, but after seeing the recent additions to `pin_init` and
>>> rewriting the code as Danilo suggested, it starteds to click. The
>>> supposed restrictions are in practice avoided by embracing the concept
>>> fully, and in the end I got that feeling (familiar when writing Rust) of
>>> being guided towards the right design - a bit like playing bowling with
>>> gutter guards.
>>
>> That's a great way to put it -- I had a similar experience when writing
>> pin-init and thinking about it purely theoretically. Good to see that it
>> works out in practice (and continues to do so :).
>>
>>> Yes, that means redesigning and rebasing our code, but that's also the
>>> cost of learning a new language.
>>>
>>> And yes, things can still be a little bit rough around the edges, but
>>> there is awareness and action taken to address these issues, at the
>>> compiler level when relevant. This makes me confident for the future.
>>
>> If you have an issue that you cannot work around, or something that
>> looks off, let me know. Maybe that's something that pin-init can deal
>> better with or we can have another library that helps with it. After all
>> pin-init is specially tailored for the kernel to work :)
>
> I was thinking about the lack of access to already-initialized fields in
> the initializer when writing this, which has been covered so thanks for
> that. :)
>
> One more thing I still don't know how to do without unsafe code is
> accessing structurally-pinned members of a pinned object. I had expected
> that projection methods would be generated for such members marked
> `#[pin]`, but I haven't found anything yet. For instance, we need to
> call mutable methods on a pinned member of a pinned object, and the only
> way I have found to do this is to use unsafe code. Is there a better
> way?

That also lands together with access to initialized fields :) Danilo
already merged it into drm-rust-next: 619db96daf94 ("rust: pin-init: add
pin projections to `#[pin_data]`").

You can only project the entire struct at once, but most of the time
that should be sufficient.

---
Cheers,
Benno
Re: [PATCH v5 02/12] gpu: nova-core: move GSP boot code to a dedicated method
Posted by Joel Fernandes 2 weeks, 4 days ago
Hi Miguel,

On Sat, Sep 13, 2025 at 10:37:34PM +0200, Miguel Ojeda wrote:
> On Sat, Sep 13, 2025 at 7:14 PM Joel Fernandes <joelagnelf@nvidia.com> wrote:
> >
> > I am not alone in that opinion.
> 
> Hmm... I am not sure how to read this.

I don't follow? I am just saying that pinning seems to be a rather odd thing
to do explictly that we don't in other languages. I know why Rust needs it,
but it is even more strange that Rust requires additional macros in the
kernel to further implement features required for it, when it probably should
been a part of the language design from the beginning (and not require
changes to macros to support usecases, like we saw for Nova).

It is possible that I don't fully understand pinning - but what I do
understand (and please correct me if I'm wrong), pinning is required mainly
because Rust default-moves data which wreaks havoc for stuff that unsafe code
(C code) exposes (spinlocks for example) or data hardware expects to be at
fixed locations. This also arises because of mixing unsafe code, with safe
code.  So it is required mainly because Unsafe code (and hardware) does not
break, due to "safe" Rust's default movement. Is my understanding wrong?

> > This should be first-class in a (systems) language, built into
> > the language itself?
> 
> I would suggest taking a look at our website and the links there (like
> issue #2) -- what we are doing upstream Rust is documented.

Sure, thanks for the pointers, I will study them further.

> (Danilo gave you a direct link, but I mention it this way because
> there are a lot of things going on, and it is worth a look and perhaps
> you may find something interesting you could help with).

Sure, thanks.

> > except to satisfy paranoia
> 
> Using unsafe code everywhere (or introducing unsoundness or UB for
> convenience) would defeat much of the Rust for Linux exercise.

Where in the thread did I suggest that, though? (using unsafe everywhere).

Also note, I'm pretty much a Rust (and Rust 4 linux) newbie and don't claim
to be any kind of expert in these. But I have studied Rust for about a year
now and pinning for a few months :). I don't mean to make noise, just
discussing so I can learn more and help in some way :) Please don't be too
offended by my ramblings.

thanks,

 - Joel

Re: [PATCH v5 02/12] gpu: nova-core: move GSP boot code to a dedicated method
Posted by Danilo Krummrich 2 weeks, 4 days ago
On Sat Sep 13, 2025 at 7:13 PM CEST, Joel Fernandes wrote:
> On Sat, Sep 13, 2025 at 03:30:31PM +0200, Danilo Krummrich wrote:
>> On Sat Sep 13, 2025 at 3:02 AM CEST, Joel Fernandes wrote:
>> > Any chance we can initialize the locks later? We don't need locking until
>> > after the boot process is completed, and if there's a way we can dynamically
>> > "pin", where we hypothetically pin after the boot process completed, that
>> > might also work. Though I am not sure if that's something possible in
>> > Rust/rust4linux or if it makes sense.
>> 
>> We can't partially initialize structures and then rely on accessing initialized
>> data only.
>
> Yet, that is exactly what the pin initialization sequence block does? The
> whole structure is not initialized yet you need access to already initialized
> fields.

No, having a reference to a partially initialized structure is UB. But of course
you can have a reference to already initialized fields within a not yet fully
initialized structure.

>> This is one of the sources for memory bugs that Rust tries to solve.
>> You can wrap fields into Option types and initialize them later, which would
>> defer pin-init calls for the price of having Option fields around.
>
> I am not denying the need for pinning. Also regarding Option, just thinking
> out loud but if something is optional temporary, maybe needing a new type
> like TempOption, and promote it to a non-option type later, I am not seeing
> that as completely outside the world, if there is a legitimate usecase that
> needs to be Option temporarily, but not later, what's wrong with that? It is
> "Optional" for the timebeing, but not later.

That's what MaybeUninit<T> from the core library already does and promoting it
to T is fundamentally unsafe for obvious reasons.

Drivers should never use that. Having partially initialized structures is a
horrible anti-pattern that we see too often in C code (only for convinience
reasons) causing real memory bugs.

If you run into a case where you want this, 99% of the time you have a design
issue that you should fix instead.

>> However, we should never do such things. If there's the necessity to do
>> something like that, it indicates a design issue.
>> 
>> In this case, there's no problem, we can use pin-init without any issues right
>> away, and should do so.
>> 
>> pin-init is going to be an essential part of *every* Rust driver given that a
>> lot of the C infrastruture that we abstract requires pinned initialization, such
>> as locks and other synchronization primitives.
>
> To be honest, the pinning concept seems like an after thought for such a
> fundamental thing that we need, requiring additional macros, and bandaids on
> top of the language itself, to make it work for the kernel. I am not alone in
> that opinion. This should be first-class in a (systems) language, built into
> the language itself? I am talking about the whole pin initialization,
> accessing fields dances, etc.

Yes, that's exactly why people (Benno) are already working on making this a
language feature (here's a first step in this direction [1]).

Benno should have more details on this.

[1] https://github.com/rust-lang/rust/pull/146307

> Also I am concerned that overusage of pinning defeats a lot of optimizations

pin-init does the oposite it allows us to use a single memory allocation where
otherwise you would need multiple.

Can you please show some optimizations that can not be done in drivers due to
pin-init for dynamic allocations?

Or in other words, what are the things you want to do with a KBox<T> in drivers
that you can't do with a Pin<KBox<T>> in a more optimal way?

> that Rust may be able to perform, especially forcefully pinning stuff that
> does not need all to be pinned (except to satisfy paranoia),

Can you please present some examples where it is a major advantage to be able to
move out of a Box in drivers? I think you will have a hard time finding them.

In C code, how often do you move fields out of structures that live within a
kmalloc() allocation?

> thus generating
> suboptimal code gen. Not only does it require redesign and concerns over
> accesses to un-initialized fields,

We're not doing any accesses to uninitialized fields with pin-init, nor do we
encourage them.

> like we saw in the last 2-3 weeks, it also
> forces people into that when maybe there is a chance that underlying
> structures do not need to be pinned at all (for some usecases).

Again, what are those use-cases where you want to be able to move out of a Box
in drivers?
Re: [PATCH v5 02/12] gpu: nova-core: move GSP boot code to a dedicated method
Posted by Joel Fernandes 2 weeks, 4 days ago
Hi Danilo,

On Sat, Sep 13, 2025 at 09:53:16PM +0200, Danilo Krummrich wrote:
> On Sat Sep 13, 2025 at 7:13 PM CEST, Joel Fernandes wrote:
> > On Sat, Sep 13, 2025 at 03:30:31PM +0200, Danilo Krummrich wrote:
> >> On Sat Sep 13, 2025 at 3:02 AM CEST, Joel Fernandes wrote:
> >> > Any chance we can initialize the locks later? We don't need locking until
> >> > after the boot process is completed, and if there's a way we can dynamically
> >> > "pin", where we hypothetically pin after the boot process completed, that
> >> > might also work. Though I am not sure if that's something possible in
> >> > Rust/rust4linux or if it makes sense.
> >> 
> >> We can't partially initialize structures and then rely on accessing initialized
> >> data only.
> >
> > Yet, that is exactly what the pin initialization sequence block does? The
> > whole structure is not initialized yet you need access to already initialized
> > fields.
> 
> No, having a reference to a partially initialized structure is UB. But of course
> you can have a reference to already initialized fields within a not yet fully
> initialized structure.

Fair enough.

> >> However, we should never do such things. If there's the necessity to do
> >> something like that, it indicates a design issue.
> >> 
> >> In this case, there's no problem, we can use pin-init without any issues right
> >> away, and should do so.
> >> 
> >> pin-init is going to be an essential part of *every* Rust driver given that a
> >> lot of the C infrastruture that we abstract requires pinned initialization, such
> >> as locks and other synchronization primitives.
> >
> > To be honest, the pinning concept seems like an after thought for such a
> > fundamental thing that we need, requiring additional macros, and bandaids on
> > top of the language itself, to make it work for the kernel. I am not alone in
> > that opinion. This should be first-class in a (systems) language, built into
> > the language itself? I am talking about the whole pin initialization,
> > accessing fields dances, etc.
> 
> Yes, that's exactly why people (Benno) are already working on making this a
> language feature (here's a first step in this direction [1]).
> 
> Benno should have more details on this.
> 
> [1] https://github.com/rust-lang/rust/pull/146307

Ack, thanks for the pointer. I will study it further.

> > Also I am concerned that overusage of pinning defeats a lot of optimizations
> 
> pin-init does the oposite it allows us to use a single memory allocation where
> otherwise you would need multiple.
> 
> Can you please show some optimizations that can not be done in drivers due to
> pin-init for dynamic allocations?

Aren't the vector resizing issues an example? The debugfs discussions for
example. You can't resize pinned vectors without boxing each element which is
suboptimal due to requiring additional allocations?

But agreed, it appears maybe this isn't as much an issue as I thought. I
think I confused prevention of stuff allocated on the stack from moving, with
pinning. I think the only other reason I can see, is to not to reduce code
readability if pinning is really not needed and if it is used, to add
appropriate code comments.

Thank you for taking the time to explain this to me. I really appreciate it,
and please let me know if I missed something!

 - Joel
Re: [PATCH v5 02/12] gpu: nova-core: move GSP boot code to a dedicated method
Posted by Benno Lossin 2 weeks, 4 days ago
On Sun Sep 14, 2025 at 1:02 AM CEST, Joel Fernandes wrote:
> On Sat, Sep 13, 2025 at 09:53:16PM +0200, Danilo Krummrich wrote:
>> On Sat Sep 13, 2025 at 7:13 PM CEST, Joel Fernandes wrote:
>> > On Sat, Sep 13, 2025 at 03:30:31PM +0200, Danilo Krummrich wrote:
>> >> However, we should never do such things. If there's the necessity to do
>> >> something like that, it indicates a design issue.
>> >> 
>> >> In this case, there's no problem, we can use pin-init without any issues right
>> >> away, and should do so.
>> >> 
>> >> pin-init is going to be an essential part of *every* Rust driver given that a
>> >> lot of the C infrastruture that we abstract requires pinned initialization, such
>> >> as locks and other synchronization primitives.
>> >
>> > To be honest, the pinning concept seems like an after thought for such a
>> > fundamental thing that we need, requiring additional macros, and bandaids on
>> > top of the language itself, to make it work for the kernel. I am not alone in
>> > that opinion. This should be first-class in a (systems) language, built into
>> > the language itself? I am talking about the whole pin initialization,
>> > accessing fields dances, etc.
>> 
>> Yes, that's exactly why people (Benno) are already working on making this a
>> language feature (here's a first step in this direction [1]).
>> 
>> Benno should have more details on this.
>> 
>> [1] https://github.com/rust-lang/rust/pull/146307

That's the link to the implementation PR, if you know the internals of
the compiler it sure is useful, but if not, only the first comment is :)

> Ack, thanks for the pointer. I will study it further.

I'd recommend looking at these links, as they talk more about the design
& not the compiler implementation:

* https://github.com/rust-lang/rust/issues/145383
* https://hackmd.io/@rust-lang-team/S1I1aEc_lx
* https://rust-lang.github.io/rust-project-goals/2025h2/field-projections.html

For pin specifically, there also is the pin-ergonomics effort:

* https://github.com/rust-lang/rust/issues/130494

Which is less general than the field projections that I'm working on,
but more specific to pin & tries to make it more compiler internal.

Now for pinned initialization, Alice has a project goal & proposal:

* https://rust-lang.github.io/rust-project-goals/2025h2/in-place-initialization.html
* https://hackmd.io/%40aliceryhl/BJutRcPblx

This proposal was heavily influenced by pin-init & we're actively
working together with others from the Rust community in getting this to
a language feature.

It's a pretty complicated feature and people just worked around it
before, which you can do when starting from the ground-up (similar to
field projections).

>> > Also I am concerned that overusage of pinning defeats a lot of optimizations
>> 
>> pin-init does the oposite it allows us to use a single memory allocation where
>> otherwise you would need multiple.
>> 
>> Can you please show some optimizations that can not be done in drivers due to
>> pin-init for dynamic allocations?
>
> Aren't the vector resizing issues an example? The debugfs discussions for
> example. You can't resize pinned vectors without boxing each element which is
> suboptimal due to requiring additional allocations?

Yes, but that's not really an optimization, is it? In the non-pinned
case, the compiler wouldn't remove the allocation. You can select less
efficient algorithms, since the objects aren't allowed to move, but that
same restriction also applies in C.

---
Cheers,
Benno
Re: [PATCH v5 02/12] gpu: nova-core: move GSP boot code to a dedicated method
Posted by Benno Lossin 3 weeks ago
On Thu Sep 11, 2025 at 3:26 PM CEST, Alexandre Courbot wrote:
> On Thu Sep 11, 2025 at 9:46 PM JST, Danilo Krummrich wrote:
>> On 9/11/25 2:17 PM, Alexandre Courbot wrote:
>>> You can see the whole process on [1]. `libos` is the object that is
>>> returned (although its name and type will change). All the rest it
>>> loading, preparing and running firmware, and that is done on the GPU. I
>>> think it would be very out of place in the GSP module.
>>> 
>>> It is also very step-by-step: run this firmware, wait for it to
>>> complete, run another one, wait for a specific message from the GSP, run
>>> the sequencer, etc. And most of this stuff is thrown away once the GSP
>>> is running. That's where the limits of what we can do with `pin_init!`
>>> are reached, and the GSP object doesn't need to be pinned anyway.
>>
>> I don't see that, in the code you linked you have a bunch of calls that don't
>> return anything that needs to survive, this can be in an initializer block.
>>
>> And then you have
>>
>> let mut libos = gsp::GspMemObjects::new(pdev, bar)?;
>>
>> which only needs the device reference and the bar reference.
>>
>> So you can easily write this as:
>>
>> try_pin_init!(Self {
>>    _: {
>>       // all the throw-away stuff from above
>>    },
>>    libos <- gsp::GspMemObjects::new(pdev, bar),
>>    _: {
>>       libos.do_some_stuff_mutable()?;
>>    }
>> })
>
> Can the second initializer block access variables created in the first?

No, that's not yet possible :( but I'll make it work next cycle.

---
Cheers,
Benno
[PATCH v5 03/12] gpu: nova-core: initialize Gpu structure fully in-place
Posted by Alexandre Courbot 3 weeks ago
This is more idiomatic when working with pinned objects, and lets us
move the Error from the `Gpu::new` method into the pin initializer it
returns.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
 drivers/gpu/nova-core/driver.rs |  9 ++++--
 drivers/gpu/nova-core/gpu.rs    | 68 +++++++++++++++++++++--------------------
 2 files changed, 42 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/nova-core/driver.rs b/drivers/gpu/nova-core/driver.rs
index 02b3edd7bbdccb22d75db5999eb9b4a71cef58c1..1380b47617f7b387666779fbf69e6933860183c0 100644
--- a/drivers/gpu/nova-core/driver.rs
+++ b/drivers/gpu/nova-core/driver.rs
@@ -34,14 +34,19 @@ fn probe(pdev: &pci::Device<Core>, _info: &Self::IdInfo) -> Result<Pin<KBox<Self
         pdev.enable_device_mem()?;
         pdev.set_master();
 
-        let bar = Arc::pin_init(
+        let devres_bar = Arc::pin_init(
             pdev.iomap_region_sized::<BAR0_SIZE>(0, c_str!("nova-core/bar0")),
             GFP_KERNEL,
         )?;
 
+        // Used to provided a `&Bar0` to `Gpu::new` without tying it to the lifetime of
+        // `devres_bar`.
+        let bar_clone = Arc::clone(&devres_bar);
+        let bar = bar_clone.access(pdev.as_ref())?;
+
         let this = KBox::pin_init(
             try_pin_init!(Self {
-                gpu <- Gpu::new(pdev, bar)?,
+                gpu <- Gpu::new(pdev, devres_bar, bar),
                 _reg: auxiliary::Registration::new(
                     pdev.as_ref(),
                     c_str!("nova-drm"),
diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index c8f876698b2e5da1d4250af377163a3f07a5ded0..92fb0e4765ed322484672a1e01568216c3e0a7db 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -172,6 +172,10 @@ pub(crate) struct Gpu {
     /// System memory page required for flushing all pending GPU-side memory writes done through
     /// PCIE into system memory, via sysmembar (A GPU-initiated HW memory-barrier operation).
     sysmem_flush: SysmemFlush,
+    /// GSP falcon instance, used for GSP bootup and cleanup.
+    gsp_falcon: Falcon<Gsp>,
+    /// SEC2 falcon instance, used for GSP bootup and cleanup.
+    sec2_falcon: Falcon<Sec2>,
     /// GSP runtime data - temporarily a empty placeholder.
     gsp: (),
 }
@@ -280,48 +284,46 @@ pub(crate) fn start_gsp(
         Ok(())
     }
 
-    pub(crate) fn new(
-        pdev: &pci::Device<device::Bound>,
+    pub(crate) fn new<'a>(
+        pdev: &'a pci::Device<device::Bound>,
         devres_bar: Arc<Devres<Bar0>>,
-    ) -> Result<impl PinInit<Self>> {
-        let bar = devres_bar.access(pdev.as_ref())?;
-        let spec = Spec::new(bar)?;
-        let fw = Firmware::new(pdev.as_ref(), spec.chipset, FIRMWARE_VERSION)?;
+        bar: &'a Bar0,
+    ) -> impl PinInit<Self, Error> + 'a {
+        try_pin_init!(Self {
+            spec: Spec::new(bar).inspect(|spec| {
+                dev_info!(
+                    pdev.as_ref(),
+                    "NVIDIA (Chipset: {}, Architecture: {:?}, Revision: {})\n",
+                    spec.chipset,
+                    spec.chipset.arch(),
+                    spec.revision
+                );
+            })?,
 
-        dev_info!(
-            pdev.as_ref(),
-            "NVIDIA (Chipset: {}, Architecture: {:?}, Revision: {})\n",
-            spec.chipset,
-            spec.chipset.arch(),
-            spec.revision
-        );
+            // We must wait for GFW_BOOT completion before doing any significant setup on the GPU.
+            _: {
+                gfw::wait_gfw_boot_completion(bar)
+                    .inspect_err(|_| dev_err!(pdev.as_ref(), "GFW boot did not complete"))?;
+            },
 
-        // We must wait for GFW_BOOT completion before doing any significant setup on the GPU.
-        gfw::wait_gfw_boot_completion(bar)
-            .inspect_err(|_| dev_err!(pdev.as_ref(), "GFW boot did not complete"))?;
+            fw <- Firmware::new(pdev.as_ref(), spec.chipset, FIRMWARE_VERSION)?,
 
-        let sysmem_flush = SysmemFlush::register(pdev.as_ref(), bar, spec.chipset)?;
+            sysmem_flush: SysmemFlush::register(pdev.as_ref(), bar, spec.chipset)?,
 
-        let gsp_falcon = Falcon::<Gsp>::new(
-            pdev.as_ref(),
-            spec.chipset,
-            bar,
-            spec.chipset > Chipset::GA100,
-        )?;
-        gsp_falcon.clear_swgen0_intr(bar);
+            gsp_falcon: Falcon::<Gsp>::new(
+                pdev.as_ref(),
+                spec.chipset,
+                bar,
+                spec.chipset > Chipset::GA100,
+            )
+            .inspect(|falcon| falcon.clear_swgen0_intr(bar))?,
 
-        let sec2_falcon = Falcon::<Sec2>::new(pdev.as_ref(), spec.chipset, bar, true)?;
+            sec2_falcon: Falcon::<Sec2>::new(pdev.as_ref(), spec.chipset, bar, true)?,
 
-        #[allow(clippy::let_unit_value)]
-        let gsp = Self::start_gsp(pdev, bar, spec.chipset, &gsp_falcon, &sec2_falcon)?;
+            gsp <- Self::start_gsp(pdev, bar, spec.chipset, gsp_falcon, sec2_falcon)?,
 
-        Ok(pin_init!(Self {
-            spec,
             bar: devres_bar,
-            fw,
-            sysmem_flush,
-            gsp,
-        }))
+        })
     }
 
     /// Called when the corresponding [`Device`](device::Device) is unbound.

-- 
2.51.0
[PATCH v5 04/12] gpu: nova-core: add Chipset::name() method
Posted by Alexandre Courbot 3 weeks ago
There are a few cases where we need the lowercase name of a given
chipset, notably to resolve firmware files paths for dynamic loading or
to build the module information.

So far, we relied on a static `NAMES` array for the latter, and some
CString hackery for the former.

Replace both with a new `name` const method that returns the lowercase
name of a chipset instance. We can generate it using the `paste!` macro.

Using this method removes the need to create a `CString` when loading
firmware, and lets us remove a couple of utility functions that now have
no user.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
 drivers/gpu/nova-core/firmware.rs |  8 +++-----
 drivers/gpu/nova-core/gpu.rs      | 25 +++++++++++++++++--------
 drivers/gpu/nova-core/util.rs     | 20 --------------------
 3 files changed, 20 insertions(+), 33 deletions(-)

diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
index 2931912ddba0ea1fe6d027ccec70b39cdb40344a..213d4506a53f83e7474861f6f81f033a9760fb61 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -30,9 +30,7 @@ pub(crate) struct Firmware {
 
 impl Firmware {
     pub(crate) fn new(dev: &device::Device, chipset: Chipset, ver: &str) -> Result<Firmware> {
-        let mut chip_name = CString::try_from_fmt(fmt!("{chipset}"))?;
-        chip_name.make_ascii_lowercase();
-        let chip_name = &*chip_name;
+        let chip_name = chipset.name();
 
         let request = |name_| {
             CString::try_from_fmt(fmt!("nvidia/{chip_name}/gsp/{name_}-{ver}.bin"))
@@ -180,8 +178,8 @@ pub(crate) const fn create(
         let mut this = Self(firmware::ModInfoBuilder::new(module_name));
         let mut i = 0;
 
-        while i < gpu::Chipset::NAMES.len() {
-            this = this.make_entry_chipset(gpu::Chipset::NAMES[i]);
+        while i < gpu::Chipset::ALL.len() {
+            this = this.make_entry_chipset(gpu::Chipset::ALL[i].name());
             i += 1;
         }
 
diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index 92fb0e4765ed322484672a1e01568216c3e0a7db..1dfd085189cbf188f9cfa829eb0cb7484d9c4d62 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -10,7 +10,6 @@
 use crate::firmware::{Firmware, FIRMWARE_VERSION};
 use crate::gfw;
 use crate::regs;
-use crate::util;
 use crate::vbios::Vbios;
 use core::fmt;
 
@@ -28,13 +27,23 @@ impl Chipset {
                 $( Chipset::$variant, )*
             ];
 
-            pub(crate) const NAMES: [&'static str; Self::ALL.len()] = [
-                $( util::const_bytes_to_str(
-                        util::to_lowercase_bytes::<{ stringify!($variant).len() }>(
-                            stringify!($variant)
-                        ).as_slice()
-                ), )*
-            ];
+            ::kernel::macros::paste!(
+            /// Returns the name of this chipset, in lowercase.
+            ///
+            /// # Examples
+            ///
+            /// ```
+            /// let chipset = Chipset::GA102;
+            /// assert_eq!(chipset.name(), "ga102");
+            /// ```
+            pub(crate) const fn name(&self) -> &'static str {
+                match *self {
+                $(
+                    Chipset::$variant => stringify!([<$variant:lower>]),
+                )*
+                }
+            }
+            );
         }
 
         // TODO[FPRI]: replace with something like derive(FromPrimitive)
diff --git a/drivers/gpu/nova-core/util.rs b/drivers/gpu/nova-core/util.rs
index 76cedf3710d7bb248f62ed50381a70f8ffb3f19a..bf35f00cb732ee4fa6644854718a0ad99051666a 100644
--- a/drivers/gpu/nova-core/util.rs
+++ b/drivers/gpu/nova-core/util.rs
@@ -3,26 +3,6 @@
 use kernel::prelude::*;
 use kernel::time::{Delta, Instant, Monotonic};
 
-pub(crate) const fn to_lowercase_bytes<const N: usize>(s: &str) -> [u8; N] {
-    let src = s.as_bytes();
-    let mut dst = [0; N];
-    let mut i = 0;
-
-    while i < src.len() && i < N {
-        dst[i] = (src[i] as char).to_ascii_lowercase() as u8;
-        i += 1;
-    }
-
-    dst
-}
-
-pub(crate) const fn const_bytes_to_str(bytes: &[u8]) -> &str {
-    match core::str::from_utf8(bytes) {
-        Ok(string) => string,
-        Err(_) => kernel::build_error!("Bytes are not valid UTF-8."),
-    }
-}
-
 /// Wait until `cond` is true or `timeout` elapsed.
 ///
 /// When `cond` evaluates to `Some`, its return value is returned.

-- 
2.51.0
[PATCH v5 05/12] gpu: nova-core: firmware: move firmware request code into a function
Posted by Alexandre Courbot 3 weeks ago
When all the firmware files are loaded from `Firmware::new`, it makes
sense to have the firmware request code as a closure. However, since we
eventually want each individual firmware constructor to request its own
file (and get rid of `Firmware` altogether), move this code into a
dedicated function that can be called by individual firmware types.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
 drivers/gpu/nova-core/firmware.rs | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
index 213d4506a53f83e7474861f6f81f033a9760fb61..4e8654d294a2205ac6f0b05b6da8d959a415ced1 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -19,6 +19,19 @@
 
 pub(crate) const FIRMWARE_VERSION: &str = "535.113.01";
 
+/// Requests the GPU firmware `name` suitable for `chipset`, with version `ver`.
+fn request_nv_firmware(
+    dev: &device::Device,
+    chipset: gpu::Chipset,
+    name: &str,
+    ver: &str,
+) -> Result<firmware::Firmware> {
+    let chip_name = chipset.name();
+
+    CString::try_from_fmt(fmt!("nvidia/{chip_name}/gsp/{name}-{ver}.bin"))
+        .and_then(|path| firmware::Firmware::request(&path, dev))
+}
+
 /// Structure encapsulating the firmware blobs required for the GPU to operate.
 #[expect(dead_code)]
 pub(crate) struct Firmware {
@@ -30,12 +43,7 @@ pub(crate) struct Firmware {
 
 impl Firmware {
     pub(crate) fn new(dev: &device::Device, chipset: Chipset, ver: &str) -> Result<Firmware> {
-        let chip_name = chipset.name();
-
-        let request = |name_| {
-            CString::try_from_fmt(fmt!("nvidia/{chip_name}/gsp/{name_}-{ver}.bin"))
-                .and_then(|path| firmware::Firmware::request(&path, dev))
-        };
+        let request = |name| request_nv_firmware(dev, chipset, name, ver);
 
         Ok(Firmware {
             booter_load: request("booter_load")?,

-- 
2.51.0
Re: [PATCH v5 05/12] gpu: nova-core: firmware: move firmware request code into a function
Posted by Danilo Krummrich 3 weeks ago
On 9/11/25 1:04 PM, Alexandre Courbot wrote:
> +/// Requests the GPU firmware `name` suitable for `chipset`, with version `ver`.
> +fn request_nv_firmware(

I think just request_firmware() is fine.
Re: [PATCH v5 05/12] gpu: nova-core: firmware: move firmware request code into a function
Posted by Alexandre Courbot 3 weeks ago
On Thu Sep 11, 2025 at 8:23 PM JST, Danilo Krummrich wrote:
> On 9/11/25 1:04 PM, Alexandre Courbot wrote:
>> +/// Requests the GPU firmware `name` suitable for `chipset`, with version `ver`.
>> +fn request_nv_firmware(
>
> I think just request_firmware() is fine.

Sounds good!
[PATCH v5 06/12] gpu: nova-core: firmware: add support for common firmware header
Posted by Alexandre Courbot 3 weeks ago
Several firmware files loaded from userspace feature a common header
that describes their payload. Add basic support for it so subsequent
patches can leverage it.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
 drivers/gpu/nova-core/firmware.rs | 62 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 62 insertions(+)

diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
index 4e8654d294a2205ac6f0b05b6da8d959a415ced1..32b685c8757b1106084577c2cc7d5ef6056c1c64 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -4,11 +4,13 @@
 //! to be loaded into a given execution unit.
 
 use core::marker::PhantomData;
+use core::mem::size_of;
 
 use kernel::device;
 use kernel::firmware;
 use kernel::prelude::*;
 use kernel::str::CString;
+use kernel::transmute::FromBytes;
 
 use crate::dma::DmaObject;
 use crate::falcon::FalconFirmware;
@@ -156,6 +158,66 @@ fn no_patch_signature(self) -> FirmwareDmaObject<F, Signed> {
     }
 }
 
+/// Header common to most firmware files.
+#[repr(C)]
+#[derive(Debug, Clone)]
+struct BinHdr {
+    /// Magic number, must be `0x10de`.
+    bin_magic: u32,
+    /// Version of the header.
+    bin_ver: u32,
+    /// Size in bytes of the binary (to be ignored).
+    bin_size: u32,
+    /// Offset of the start of the application-specific header.
+    header_offset: u32,
+    /// Offset of the start of the data payload.
+    data_offset: u32,
+    /// Size in bytes of the data payload.
+    data_size: u32,
+}
+
+// SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability.
+unsafe impl FromBytes for BinHdr {}
+
+// A firmware blob starting with a `BinHdr`.
+struct BinFirmware<'a> {
+    hdr: BinHdr,
+    fw: &'a [u8],
+}
+
+#[expect(dead_code)]
+impl<'a> BinFirmware<'a> {
+    /// Interpret `fw` as a firmware image starting with a [`BinHdr`], and returns the
+    /// corresponding [`BinFirmware`] that can be used to extract its payload.
+    fn new(fw: &'a firmware::Firmware) -> Result<Self> {
+        const BIN_MAGIC: u32 = 0x10de;
+        let fw = fw.data();
+
+        fw.get(0..size_of::<BinHdr>())
+            // Extract header.
+            .and_then(BinHdr::from_bytes_copy)
+            // Validate header.
+            .and_then(|hdr| {
+                if hdr.bin_magic == BIN_MAGIC {
+                    Some(hdr)
+                } else {
+                    None
+                }
+            })
+            .map(|hdr| Self { hdr, fw })
+            .ok_or(EINVAL)
+    }
+
+    /// Returns the data payload of the firmware, or `None` if the data range is out of bounds of
+    /// the firmware image.
+    fn data(&self) -> Option<&[u8]> {
+        let fw_start = self.hdr.data_offset as usize;
+        let fw_size = self.hdr.data_size as usize;
+
+        self.fw.get(fw_start..fw_start + fw_size)
+    }
+}
+
 pub(crate) struct ModInfoBuilder<const N: usize>(firmware::ModInfoBuilder<N>);
 
 impl<const N: usize> ModInfoBuilder<N> {

-- 
2.51.0
[PATCH v5 07/12] gpu: nova-core: firmware: process Booter and patch its signature
Posted by Alexandre Courbot 3 weeks ago
The Booter signed firmware is an essential part of bringing up the GSP
on Turing and Ampere. It is loaded on the sec2 falcon core and is
responsible for loading and running the RISC-V GSP bootloader into the
GSP core.

Add support for parsing the Booter firmware loaded from userspace, patch
its signatures, and store it into a form that is ready to be loaded and
executed on the sec2 falcon.

Then, move the Booter instance from the `Firmware` struct to the
`start_gsp` method since it doesn't need to be kept after the GSP is
booted.

We do not run Booter yet, as its own payload (the GSP bootloader and
firmware image) still need to be prepared.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
 drivers/gpu/nova-core/falcon.rs          |   4 +-
 drivers/gpu/nova-core/firmware.rs        |   6 +-
 drivers/gpu/nova-core/firmware/booter.rs | 375 +++++++++++++++++++++++++++++++
 drivers/gpu/nova-core/gpu.rs             |  12 +-
 4 files changed, 389 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/nova-core/falcon.rs b/drivers/gpu/nova-core/falcon.rs
index b16207e7242fe1ac26b8552575b8b4e52f34cf6a..37e6298195e49a9a29e81226abe16cd410c9adbc 100644
--- a/drivers/gpu/nova-core/falcon.rs
+++ b/drivers/gpu/nova-core/falcon.rs
@@ -293,7 +293,7 @@ pub(crate) trait FalconEngine:
 }
 
 /// Represents a portion of the firmware to be loaded into a particular memory (e.g. IMEM or DMEM).
-#[derive(Debug)]
+#[derive(Debug, Clone)]
 pub(crate) struct FalconLoadTarget {
     /// Offset from the start of the source object to copy from.
     pub(crate) src_start: u32,
@@ -304,7 +304,7 @@ pub(crate) struct FalconLoadTarget {
 }
 
 /// Parameters for the falcon boot ROM.
-#[derive(Debug)]
+#[derive(Debug, Clone)]
 pub(crate) struct FalconBromParams {
     /// Offset in `DMEM`` of the firmware's signature.
     pub(crate) pkc_data_offset: u32,
diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
index 32b685c8757b1106084577c2cc7d5ef6056c1c64..d954b1e98fda82c44f87d2103e31fa717c392d79 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -17,6 +17,7 @@
 use crate::gpu;
 use crate::gpu::Chipset;
 
+pub(crate) mod booter;
 pub(crate) mod fwsec;
 
 pub(crate) const FIRMWARE_VERSION: &str = "535.113.01";
@@ -37,8 +38,6 @@ fn request_nv_firmware(
 /// Structure encapsulating the firmware blobs required for the GPU to operate.
 #[expect(dead_code)]
 pub(crate) struct Firmware {
-    booter_load: firmware::Firmware,
-    booter_unload: firmware::Firmware,
     bootloader: firmware::Firmware,
     gsp: firmware::Firmware,
 }
@@ -48,8 +47,6 @@ pub(crate) fn new(dev: &device::Device, chipset: Chipset, ver: &str) -> Result<F
         let request = |name| request_nv_firmware(dev, chipset, name, ver);
 
         Ok(Firmware {
-            booter_load: request("booter_load")?,
-            booter_unload: request("booter_unload")?,
             bootloader: request("bootloader")?,
             gsp: request("gsp")?,
         })
@@ -185,7 +182,6 @@ struct BinFirmware<'a> {
     fw: &'a [u8],
 }
 
-#[expect(dead_code)]
 impl<'a> BinFirmware<'a> {
     /// Interpret `fw` as a firmware image starting with a [`BinHdr`], and returns the
     /// corresponding [`BinFirmware`] that can be used to extract its payload.
diff --git a/drivers/gpu/nova-core/firmware/booter.rs b/drivers/gpu/nova-core/firmware/booter.rs
new file mode 100644
index 0000000000000000000000000000000000000000..a4cd9c84bea75b68565778841e78a20cdec9333e
--- /dev/null
+++ b/drivers/gpu/nova-core/firmware/booter.rs
@@ -0,0 +1,375 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! Support for loading and patching the `Booter` firmware. `Booter` is a Heavy Secured firmware
+//! running on [`Sec2`], that is used on Turing/Ampere to load the GSP firmware into the GSP falcon
+//! (and optionally unload it through a separate firmware image).
+
+use core::marker::PhantomData;
+use core::mem::size_of;
+use core::ops::Deref;
+
+use kernel::device;
+use kernel::prelude::*;
+use kernel::transmute::FromBytes;
+
+use crate::dma::DmaObject;
+use crate::driver::Bar0;
+use crate::falcon::sec2::Sec2;
+use crate::falcon::{Falcon, FalconBromParams, FalconFirmware, FalconLoadParams, FalconLoadTarget};
+use crate::firmware::{BinFirmware, FirmwareDmaObject, FirmwareSignature, Signed, Unsigned};
+use crate::gpu::Chipset;
+
+/// Local convenience function to return a copy of `S` by reinterpreting the bytes starting at
+/// `offset` in `slice`.
+fn frombytes_at<S: FromBytes + Sized>(slice: &[u8], offset: usize) -> Result<S> {
+    slice
+        .get(offset..offset + size_of::<S>())
+        .and_then(S::from_bytes_copy)
+        .ok_or(EINVAL)
+}
+
+/// Heavy-Secured firmware header.
+///
+/// Such firmwares have an application-specific payload that needs to be patched with a given
+/// signature.
+#[repr(C)]
+#[derive(Debug, Clone)]
+struct HsHeaderV2 {
+    /// Offset to the start of the signatures.
+    sig_prod_offset: u32,
+    /// Size in bytes of the signatures.
+    sig_prod_size: u32,
+    /// Offset to a `u32` containing the location at which to patch the signature in the microcode
+    /// image.
+    patch_loc_offset: u32,
+    /// Offset to a `u32` containing the index of the signature to patch.
+    patch_sig_offset: u32,
+    /// Start offset to the signature metadata.
+    meta_data_offset: u32,
+    /// Size in bytes of the signature metadata.
+    meta_data_size: u32,
+    /// Offset to a `u32` containing the number of signatures in the signatures section.
+    num_sig_offset: u32,
+    /// Offset of the application-specific header.
+    header_offset: u32,
+    /// Size in bytes of the application-specific header.
+    header_size: u32,
+}
+
+// SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability.
+unsafe impl FromBytes for HsHeaderV2 {}
+
+/// Heavy-Secured Firmware image container.
+///
+/// This provides convenient access to the fields of [`HsHeaderV2`] that are actually indices to
+/// read from in the firmware data.
+struct HsFirmwareV2<'a> {
+    hdr: HsHeaderV2,
+    fw: &'a [u8],
+}
+
+impl<'a> HsFirmwareV2<'a> {
+    /// Interprets the header of `bin_fw` as a [`HsHeaderV2`] and returns an instance of
+    /// `HsFirmwareV2` for further parsing.
+    ///
+    /// Fails if the header pointed at by `bin_fw` is not within the bounds of the firmware image.
+    fn new(bin_fw: &BinFirmware<'a>) -> Result<Self> {
+        frombytes_at::<HsHeaderV2>(bin_fw.fw, bin_fw.hdr.header_offset as usize)
+            .map(|hdr| Self { hdr, fw: bin_fw.fw })
+    }
+
+    /// Returns the location at which the signatures should be patched in the microcode image.
+    ///
+    /// Fails if the offset of the patch location is outside the bounds of the firmware
+    /// image.
+    fn patch_location(&self) -> Result<u32> {
+        frombytes_at::<u32>(self.fw, self.hdr.patch_loc_offset as usize)
+    }
+
+    /// Returns an iterator to the signatures of the firmware. The iterator can be empty if the
+    /// firmware is unsigned.
+    ///
+    /// Fails if the pointed signatures are outside the bounds of the firmware image.
+    fn signatures_iter(&'a self) -> Result<impl Iterator<Item = BooterSignature<'a>>> {
+        let num_sig = frombytes_at::<u32>(self.fw, self.hdr.num_sig_offset as usize)?;
+        let iter = match self.hdr.sig_prod_size.checked_div(num_sig) {
+            // If there are no signatures, return an iterator that will yield zero elements.
+            None => (&[] as &[u8]).chunks_exact(1),
+            Some(sig_size) => {
+                let patch_sig = frombytes_at::<u32>(self.fw, self.hdr.patch_sig_offset as usize)?;
+                let signatures_start = (self.hdr.sig_prod_offset + patch_sig) as usize;
+
+                self.fw
+                    // Get signatures range.
+                    .get(signatures_start..signatures_start + self.hdr.sig_prod_size as usize)
+                    .ok_or(EINVAL)?
+                    .chunks_exact(sig_size as usize)
+            }
+        };
+
+        // Map the byte slices into signatures.
+        Ok(iter.map(BooterSignature))
+    }
+}
+
+/// Signature parameters, as defined in the firmware.
+#[repr(C)]
+struct HsSignatureParams {
+    /// Fuse version to use.
+    fuse_ver: u32,
+    /// Mask of engine IDs this firmware applies to.
+    engine_id_mask: u32,
+    /// ID of the microcode.
+    ucode_id: u32,
+}
+
+// SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability.
+unsafe impl FromBytes for HsSignatureParams {}
+
+impl HsSignatureParams {
+    /// Returns the signature parameters contained in `hs_fw`.
+    ///
+    /// Fails if the meta data parameter of `hs_fw` is outside the bounds of the firmware image, or
+    /// if its size doesn't match that of [`HsSignatureParams`].
+    fn new(hs_fw: &HsFirmwareV2<'_>) -> Result<Self> {
+        let start = hs_fw.hdr.meta_data_offset as usize;
+        let end = start
+            .checked_add(hs_fw.hdr.meta_data_size as usize)
+            .ok_or(EINVAL)?;
+
+        hs_fw
+            .fw
+            .get(start..end)
+            .and_then(Self::from_bytes_copy)
+            .ok_or(EINVAL)
+    }
+}
+
+/// Header for code and data load offsets.
+#[repr(C)]
+#[derive(Debug, Clone)]
+struct HsLoadHeaderV2 {
+    // Offset at which the code starts.
+    os_code_offset: u32,
+    // Total size of the code, for all apps.
+    os_code_size: u32,
+    // Offset at which the data starts.
+    os_data_offset: u32,
+    // Size of the data.
+    os_data_size: u32,
+    // Number of apps following this header. Each app is described by a [`HsLoadHeaderV2App`].
+    num_apps: u32,
+}
+
+// SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability.
+unsafe impl FromBytes for HsLoadHeaderV2 {}
+
+impl HsLoadHeaderV2 {
+    /// Returns the load header contained in `hs_fw`.
+    ///
+    /// Fails if the header pointed at by `hs_fw` is not within the bounds of the firmware image.
+    fn new(hs_fw: &HsFirmwareV2<'_>) -> Result<Self> {
+        frombytes_at::<Self>(hs_fw.fw, hs_fw.hdr.header_offset as usize)
+    }
+}
+
+/// Header for app code loader.
+#[repr(C)]
+#[derive(Debug, Clone)]
+struct HsLoadHeaderV2App {
+    /// Offset at which to load the app code.
+    offset: u32,
+    /// Length in bytes of the app code.
+    len: u32,
+}
+
+// SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability.
+unsafe impl FromBytes for HsLoadHeaderV2App {}
+
+impl HsLoadHeaderV2App {
+    /// Returns the [`HsLoadHeaderV2App`] for app `idx` of `hs_fw`.
+    ///
+    /// Fails if `idx` is larger than the number of apps declared in `hs_fw`, or if the header is
+    /// not within the bounds of the firmware image.
+    fn new(hs_fw: &HsFirmwareV2<'_>, idx: u32) -> Result<Self> {
+        let load_hdr = HsLoadHeaderV2::new(hs_fw)?;
+        if idx >= load_hdr.num_apps {
+            Err(EINVAL)
+        } else {
+            frombytes_at::<Self>(
+                hs_fw.fw,
+                (hs_fw.hdr.header_offset as usize)
+                    // Skip the load header...
+                    .checked_add(size_of::<HsLoadHeaderV2>())
+                    // ... and jump to app header `idx`.
+                    .and_then(|offset| {
+                        offset.checked_add((idx as usize).checked_mul(size_of::<Self>())?)
+                    })
+                    .ok_or(EINVAL)?,
+            )
+        }
+    }
+}
+
+/// Signature for Booter firmware. Their size is encoded into the header and not known a compile
+/// time, so we just wrap a byte slices on which we can implement [`FirmwareSignature`].
+struct BooterSignature<'a>(&'a [u8]);
+
+impl<'a> AsRef<[u8]> for BooterSignature<'a> {
+    fn as_ref(&self) -> &[u8] {
+        self.0
+    }
+}
+
+impl<'a> FirmwareSignature<BooterFirmware> for BooterSignature<'a> {}
+
+/// The `Booter` loader firmware, responsible for loading the GSP.
+pub(crate) struct BooterFirmware {
+    // Load parameters for `IMEM` falcon memory.
+    imem_load_target: FalconLoadTarget,
+    // Load parameters for `DMEM` falcon memory.
+    dmem_load_target: FalconLoadTarget,
+    // BROM falcon parameters.
+    brom_params: FalconBromParams,
+    // Device-mapped firmware image.
+    ucode: FirmwareDmaObject<Self, Signed>,
+}
+
+impl FirmwareDmaObject<BooterFirmware, Unsigned> {
+    fn new_booter(dev: &device::Device<device::Bound>, data: &[u8]) -> Result<Self> {
+        DmaObject::from_data(dev, data).map(|ucode| Self(ucode, PhantomData))
+    }
+}
+
+#[derive(Copy, Clone, Debug, PartialEq)]
+pub(crate) enum BooterKind {
+    Loader,
+    #[expect(unused)]
+    Unloader,
+}
+
+impl BooterFirmware {
+    /// Parses the Booter firmware contained in `fw`, and patches the correct signature so it is
+    /// ready to be loaded and run on `falcon`.
+    pub(crate) fn new(
+        dev: &device::Device<device::Bound>,
+        kind: BooterKind,
+        chipset: Chipset,
+        ver: &str,
+        falcon: &Falcon<<Self as FalconFirmware>::Target>,
+        bar: &Bar0,
+    ) -> Result<Self> {
+        let fw_name = match kind {
+            BooterKind::Loader => "booter_load",
+            BooterKind::Unloader => "booter_unload",
+        };
+        let fw = super::request_nv_firmware(dev, chipset, fw_name, ver)?;
+        let bin_fw = BinFirmware::new(&fw)?;
+
+        // The binary firmware embeds a Heavy-Secured firmware.
+        let hs_fw = HsFirmwareV2::new(&bin_fw)?;
+
+        // The Heavy-Secured firmware embeds a firmware load descriptor.
+        let load_hdr = HsLoadHeaderV2::new(&hs_fw)?;
+
+        // Offset in `ucode` where to patch the signature.
+        let patch_loc = hs_fw.patch_location()?;
+
+        let sig_params = HsSignatureParams::new(&hs_fw)?;
+        let brom_params = FalconBromParams {
+            // `load_hdr.os_data_offset` is an absolute index, but `pkc_data_offset` is from the
+            // signature patch location.
+            pkc_data_offset: patch_loc
+                .checked_sub(load_hdr.os_data_offset)
+                .ok_or(EINVAL)?,
+            engine_id_mask: u16::try_from(sig_params.engine_id_mask).map_err(|_| EINVAL)?,
+            ucode_id: u8::try_from(sig_params.ucode_id).map_err(|_| EINVAL)?,
+        };
+        let app0 = HsLoadHeaderV2App::new(&hs_fw, 0)?;
+
+        // Object containing the firmware microcode to be signature-patched.
+        let ucode = bin_fw
+            .data()
+            .ok_or(EINVAL)
+            .and_then(|data| FirmwareDmaObject::<Self, _>::new_booter(dev, data))?;
+
+        let ucode_signed = {
+            let mut signatures = hs_fw.signatures_iter()?.peekable();
+
+            if signatures.peek().is_none() {
+                // If there are no signatures, then the firmware is unsigned.
+                ucode.no_patch_signature()
+            } else {
+                // Obtain the version from the fuse register, and extract the corresponding
+                // signature.
+                let reg_fuse_version = falcon.signature_reg_fuse_version(
+                    bar,
+                    brom_params.engine_id_mask,
+                    brom_params.ucode_id,
+                )?;
+
+                // `0` means the last signature should be used.
+                const FUSE_VERSION_USE_LAST_SIG: u32 = 0;
+                let signature = match reg_fuse_version {
+                    FUSE_VERSION_USE_LAST_SIG => signatures.last(),
+                    // Otherwise hardware fuse version needs to be subtracted to obtain the index.
+                    reg_fuse_version => {
+                        let Some(idx) = sig_params.fuse_ver.checked_sub(reg_fuse_version) else {
+                            dev_err!(dev, "invalid fuse version for Booter firmware\n");
+                            return Err(EINVAL);
+                        };
+                        signatures.nth(idx as usize)
+                    }
+                }
+                .ok_or(EINVAL)?;
+
+                ucode.patch_signature(&signature, patch_loc as usize)?
+            }
+        };
+
+        Ok(Self {
+            imem_load_target: FalconLoadTarget {
+                src_start: app0.offset,
+                dst_start: 0,
+                len: app0.len,
+            },
+            dmem_load_target: FalconLoadTarget {
+                src_start: load_hdr.os_data_offset,
+                dst_start: 0,
+                len: load_hdr.os_data_size,
+            },
+            brom_params,
+            ucode: ucode_signed,
+        })
+    }
+}
+
+impl FalconLoadParams for BooterFirmware {
+    fn imem_load_params(&self) -> FalconLoadTarget {
+        self.imem_load_target.clone()
+    }
+
+    fn dmem_load_params(&self) -> FalconLoadTarget {
+        self.dmem_load_target.clone()
+    }
+
+    fn brom_params(&self) -> FalconBromParams {
+        self.brom_params.clone()
+    }
+
+    fn boot_addr(&self) -> u32 {
+        self.imem_load_target.src_start
+    }
+}
+
+impl Deref for BooterFirmware {
+    type Target = DmaObject;
+
+    fn deref(&self) -> &Self::Target {
+        &self.ucode.0
+    }
+}
+
+impl FalconFirmware for BooterFirmware {
+    type Target = Sec2;
+}
diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index 1dfd085189cbf188f9cfa829eb0cb7484d9c4d62..06a7ee8f4195759fb55ad483852724bb1ab46793 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -6,6 +6,7 @@
 use crate::falcon::{gsp::Gsp, sec2::Sec2, Falcon};
 use crate::fb::FbLayout;
 use crate::fb::SysmemFlush;
+use crate::firmware::booter::{BooterFirmware, BooterKind};
 use crate::firmware::fwsec::{FwsecCommand, FwsecFirmware};
 use crate::firmware::{Firmware, FIRMWARE_VERSION};
 use crate::gfw;
@@ -278,7 +279,7 @@ pub(crate) fn start_gsp(
         bar: &Bar0,
         chipset: Chipset,
         gsp_falcon: &Falcon<Gsp>,
-        _sec2_falcon: &Falcon<Sec2>,
+        sec2_falcon: &Falcon<Sec2>,
     ) -> Result<()> {
         let dev = pdev.as_ref();
 
@@ -289,6 +290,15 @@ pub(crate) fn start_gsp(
 
         Self::run_fwsec_frts(dev, gsp_falcon, bar, &bios, &fb_layout)?;
 
+        let _booter_loader = BooterFirmware::new(
+            dev,
+            BooterKind::Loader,
+            chipset,
+            FIRMWARE_VERSION,
+            sec2_falcon,
+            bar,
+        )?;
+
         // Return an empty placeholder for now, to be replaced with the GSP runtime data.
         Ok(())
     }

-- 
2.51.0
[PATCH v5 08/12] gpu: nova-core: firmware: process and prepare the GSP firmware
Posted by Alexandre Courbot 3 weeks ago
The GSP firmware is a binary blob that is verified, loaded, and run by
the GSP bootloader. Its presentation is a bit peculiar as the GSP
bootloader expects to be given a DMA address to a 3-levels page table
mapping the GSP firmware at address 0 of its own address space.

Prepare such a structure containing the DMA-mapped firmware as well as
the DMA-mapped page tables, and a way to obtain the DMA handle of the
level 0 page table.

Then, move the GSP firmware instance from the `Firmware` struct to the
`start_gsp` method since it doesn't need to be kept after the GSP is
booted.

As we are performing the required ELF section parsing and radix3 page
table building, remove these items from the TODO file.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
 Documentation/gpu/nova/core/todo.rst  |  17 ---
 drivers/gpu/nova-core/firmware.rs     |   3 +-
 drivers/gpu/nova-core/firmware/gsp.rs | 232 ++++++++++++++++++++++++++++++++++
 drivers/gpu/nova-core/gpu.rs          |   6 +
 drivers/gpu/nova-core/gsp.rs          |   4 +
 drivers/gpu/nova-core/nova_core.rs    |   1 +
 6 files changed, 244 insertions(+), 19 deletions(-)

diff --git a/Documentation/gpu/nova/core/todo.rst b/Documentation/gpu/nova/core/todo.rst
index 89431fec9041b1f35cc55799c91f48dc6bc918eb..0972cb905f7ae64dfbaef4808276757319009e9c 100644
--- a/Documentation/gpu/nova/core/todo.rst
+++ b/Documentation/gpu/nova/core/todo.rst
@@ -229,23 +229,6 @@ Rust abstraction for debugfs APIs.
 GPU (general)
 =============
 
-Parse firmware headers
-----------------------
-
-Parse ELF headers from the firmware files loaded from the filesystem.
-
-| Reference: ELF utils
-| Complexity: Beginner
-| Contact: Abdiel Janulgue
-
-Build radix3 page table
------------------------
-
-Build the radix3 page table to map the firmware.
-
-| Complexity: Intermediate
-| Contact: Abdiel Janulgue
-
 Initial Devinit support
 -----------------------
 
diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
index d954b1e98fda82c44f87d2103e31fa717c392d79..cef910a7c2dc360139fafc2a01a050a9df40e45f 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -19,6 +19,7 @@
 
 pub(crate) mod booter;
 pub(crate) mod fwsec;
+pub(crate) mod gsp;
 
 pub(crate) const FIRMWARE_VERSION: &str = "535.113.01";
 
@@ -39,7 +40,6 @@ fn request_nv_firmware(
 #[expect(dead_code)]
 pub(crate) struct Firmware {
     bootloader: firmware::Firmware,
-    gsp: firmware::Firmware,
 }
 
 impl Firmware {
@@ -48,7 +48,6 @@ pub(crate) fn new(dev: &device::Device, chipset: Chipset, ver: &str) -> Result<F
 
         Ok(Firmware {
             bootloader: request("bootloader")?,
-            gsp: request("gsp")?,
         })
     }
 }
diff --git a/drivers/gpu/nova-core/firmware/gsp.rs b/drivers/gpu/nova-core/firmware/gsp.rs
new file mode 100644
index 0000000000000000000000000000000000000000..9242e8ef6a0183cbfe5e75a3d6552ff528a2cedc
--- /dev/null
+++ b/drivers/gpu/nova-core/firmware/gsp.rs
@@ -0,0 +1,232 @@
+// SPDX-License-Identifier: GPL-2.0
+
+use kernel::device;
+use kernel::dma::{DataDirection, DmaAddress};
+use kernel::kvec;
+use kernel::prelude::*;
+use kernel::scatterlist::{Owned, SGTable};
+
+use crate::dma::DmaObject;
+use crate::gpu::{Architecture, Chipset};
+use crate::gsp::GSP_PAGE_SIZE;
+
+/// Ad-hoc and temporary module to extract sections from ELF images.
+///
+/// Some firmware images are currently packaged as ELF files, where sections names are used as keys
+/// to specific and related bits of data. Future firmware versions are scheduled to move away from
+/// that scheme before nova-core becomes stable, which means this module will eventually be
+/// removed.
+mod elf {
+    use kernel::bindings;
+    use kernel::str::CStr;
+    use kernel::transmute::FromBytes;
+
+    /// Newtype to provide a [`FromBytes`] implementation.
+    #[repr(transparent)]
+    struct Elf64Hdr(bindings::elf64_hdr);
+    // SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability.
+    unsafe impl FromBytes for Elf64Hdr {}
+
+    #[repr(transparent)]
+    struct Elf64SHdr(bindings::elf64_shdr);
+    // SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability.
+    unsafe impl FromBytes for Elf64SHdr {}
+
+    /// Tries to extract section with name `name` from the ELF64 image `elf`, and returns it.
+    pub(super) fn elf64_section<'a, 'b>(elf: &'a [u8], name: &'b str) -> Option<&'a [u8]> {
+        let hdr = &elf
+            .get(0..size_of::<bindings::elf64_hdr>())
+            .and_then(Elf64Hdr::from_bytes)?
+            .0;
+
+        // Get all the section headers.
+        let mut shdr = {
+            let shdr_num = usize::from(hdr.e_shnum);
+            let shdr_start = usize::try_from(hdr.e_shoff).ok()?;
+            let shdr_end = shdr_num
+                .checked_mul(size_of::<Elf64SHdr>())
+                .and_then(|v| v.checked_add(shdr_start))?;
+
+            elf.get(shdr_start..shdr_end)
+                .map(|slice| slice.chunks_exact(size_of::<Elf64SHdr>()))?
+        };
+
+        // Get the strings table.
+        let strhdr = shdr
+            .clone()
+            .nth(usize::from(hdr.e_shstrndx))
+            .and_then(Elf64SHdr::from_bytes)?;
+
+        // Find the section which name matches `name` and return it.
+        shdr.find(|&sh| {
+            let Some(hdr) = Elf64SHdr::from_bytes(sh) else {
+                return false;
+            };
+
+            let Some(name_idx) = strhdr
+                .0
+                .sh_offset
+                .checked_add(u64::from(hdr.0.sh_name))
+                .and_then(|idx| usize::try_from(idx).ok())
+            else {
+                return false;
+            };
+
+            // Get the start of the name.
+            elf.get(name_idx..)
+                // Stop at the first `0`.
+                .and_then(|nstr| nstr.get(0..=nstr.iter().position(|b| *b == 0)?))
+                // Convert into CStr. This should never fail because of the line above.
+                .and_then(|nstr| CStr::from_bytes_with_nul(nstr).ok())
+                // Convert into str.
+                .and_then(|c_str| c_str.to_str().ok())
+                // Check that the name matches.
+                .map(|str| str == name)
+                .unwrap_or(false)
+        })
+        // Return the slice containing the section.
+        .and_then(|sh| {
+            let hdr = Elf64SHdr::from_bytes(sh)?;
+            let start = usize::try_from(hdr.0.sh_offset).ok()?;
+            let end = usize::try_from(hdr.0.sh_size)
+                .ok()
+                .and_then(|sh_size| start.checked_add(sh_size))?;
+
+            elf.get(start..end)
+        })
+    }
+}
+
+/// GSP firmware with 3-level radix page tables for the GSP bootloader.
+///
+/// The bootloader expects firmware to be mapped starting at address 0 in GSP's virtual address
+/// space:
+///
+/// ```text
+/// Level 0:  1 page, 1 entry         -> points to first level 1 page
+/// Level 1:  Multiple pages/entries  -> each entry points to a level 2 page
+/// Level 2:  Multiple pages/entries  -> each entry points to a firmware page
+/// ```
+///
+/// Each page is 4KB, each entry is 8 bytes (64-bit DMA address).
+/// Also known as "Radix3" firmware.
+#[pin_data]
+pub(crate) struct GspFirmware {
+    /// The GSP firmware inside a [`VVec`], device-mapped via a SG table.
+    #[pin]
+    fw: SGTable<Owned<VVec<u8>>>,
+    /// Level 2 page table whose entries contain DMA addresses of firmware pages.
+    #[pin]
+    level2: SGTable<Owned<VVec<u8>>>,
+    /// Level 1 page table whose entries contain DMA addresses of level 2 pages.
+    #[pin]
+    level1: SGTable<Owned<VVec<u8>>>,
+    /// Level 0 page table (single 4KB page) with one entry: DMA address of first level 1 page.
+    level0: DmaObject,
+    /// Size in bytes of the firmware contained in [`Self::fw`].
+    pub size: usize,
+    /// Device-mapped GSP signatures matching the GPU's [`Chipset`].
+    signatures: DmaObject,
+}
+
+impl GspFirmware {
+    /// Loads the GSP firmware binaries, map them into `dev`'s address-space, and creates the page
+    /// tables expected by the GSP bootloader to load it.
+    pub(crate) fn new<'a, 'b>(
+        dev: &'a device::Device<device::Bound>,
+        chipset: Chipset,
+        ver: &'b str,
+    ) -> Result<impl PinInit<Self, Error> + 'a> {
+        let fw = super::request_nv_firmware(dev, chipset, "gsp", ver)?;
+
+        let fw_section = elf::elf64_section(fw.data(), ".fwimage").ok_or(EINVAL)?;
+
+        let sigs_section = match chipset.arch() {
+            Architecture::Ampere => ".fwsignature_ga10x",
+            _ => return Err(ENOTSUPP),
+        };
+        let signatures = elf::elf64_section(fw.data(), sigs_section)
+            .ok_or(EINVAL)
+            .and_then(|data| DmaObject::from_data(dev, data))?;
+
+        let size = fw_section.len();
+
+        // Move the firmware into a vmalloc'd vector and map it into the device address
+        // space.
+        let fw_vvec = VVec::with_capacity(fw_section.len(), GFP_KERNEL)
+            .and_then(|mut v| {
+                v.extend_from_slice(fw_section, GFP_KERNEL)?;
+                Ok(v)
+            })
+            .map_err(|_| ENOMEM)?;
+
+        Ok(try_pin_init!(Self {
+            fw <- SGTable::new(dev, fw_vvec, DataDirection::ToDevice, GFP_KERNEL),
+            level2 <- {
+                // Allocate the level 2 page table, map the firmware onto it, and map it into the
+                // device address space.
+                VVec::<u8>::with_capacity(
+                    fw.iter().count() * core::mem::size_of::<u64>(),
+                    GFP_KERNEL,
+                )
+                .map_err(|_| ENOMEM)
+                .and_then(|level2| map_into_lvl(&fw, level2))
+                .map(|level2| SGTable::new(dev, level2, DataDirection::ToDevice, GFP_KERNEL))?
+            },
+            level1 <- {
+                // Allocate the level 1 page table, map the level 2 page table onto it, and map it
+                // into the device address space.
+                VVec::<u8>::with_capacity(
+                    level2.iter().count() * core::mem::size_of::<u64>(),
+                    GFP_KERNEL,
+                )
+                .map_err(|_| ENOMEM)
+                .and_then(|level1| map_into_lvl(&level2, level1))
+                .map(|level1| SGTable::new(dev, level1, DataDirection::ToDevice, GFP_KERNEL))?
+            },
+            level0: {
+                // Allocate the level 0 page table as a device-visible DMA object, and map the
+                // level 1 page table onto it.
+
+                // Level 0 page table data.
+                let mut level0_data = kvec![0u8; GSP_PAGE_SIZE]?;
+
+                // Fill level 1 page entry.
+                #[allow(clippy::useless_conversion)]
+                let level1_entry = u64::from(level1.iter().next().unwrap().dma_address());
+                let dst = &mut level0_data[..size_of_val(&level1_entry)];
+                dst.copy_from_slice(&level1_entry.to_le_bytes());
+
+                // Turn the level0 page table into a [`DmaObject`].
+                DmaObject::from_data(dev, &level0_data)?
+            },
+            size,
+            signatures,
+        }))
+    }
+
+    #[expect(unused)]
+    /// Returns the DMA handle of the radix3 level 0 page table.
+    pub(crate) fn radix3_dma_handle(&self) -> DmaAddress {
+        self.level0.dma_handle()
+    }
+}
+
+/// Build a page table from a scatter-gather list.
+///
+/// Takes each DMA-mapped region from `sg_table` and writes page table entries
+/// for all 4KB pages within that region. For example, a 16KB SG entry becomes
+/// 4 consecutive page table entries.
+fn map_into_lvl(sg_table: &SGTable<Owned<VVec<u8>>>, mut dst: VVec<u8>) -> Result<VVec<u8>> {
+    for sg_entry in sg_table.iter() {
+        // Number of pages we need to map.
+        let num_pages = (sg_entry.dma_len() as usize).div_ceil(GSP_PAGE_SIZE);
+
+        for i in 0..num_pages {
+            let entry = sg_entry.dma_address() + (i as u64 * GSP_PAGE_SIZE as u64);
+            dst.extend_from_slice(&entry.to_le_bytes(), GFP_KERNEL)?;
+        }
+    }
+
+    Ok(dst)
+}
diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index 06a7ee8f4195759fb55ad483852724bb1ab46793..8505ee81c43e7628d1f099aff285244f8908c633 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -8,6 +8,7 @@
 use crate::fb::SysmemFlush;
 use crate::firmware::booter::{BooterFirmware, BooterKind};
 use crate::firmware::fwsec::{FwsecCommand, FwsecFirmware};
+use crate::firmware::gsp::GspFirmware;
 use crate::firmware::{Firmware, FIRMWARE_VERSION};
 use crate::gfw;
 use crate::regs;
@@ -285,6 +286,11 @@ pub(crate) fn start_gsp(
 
         let bios = Vbios::new(dev, bar)?;
 
+        let _gsp_fw = KBox::pin_init(
+            GspFirmware::new(dev, chipset, FIRMWARE_VERSION)?,
+            GFP_KERNEL,
+        )?;
+
         let fb_layout = FbLayout::new(chipset, bar)?;
         dev_dbg!(dev, "{:#x?}\n", fb_layout);
 
diff --git a/drivers/gpu/nova-core/gsp.rs b/drivers/gpu/nova-core/gsp.rs
new file mode 100644
index 0000000000000000000000000000000000000000..a0e7ec5f6c9c959d57540b3ebf4b782f2e002b08
--- /dev/null
+++ b/drivers/gpu/nova-core/gsp.rs
@@ -0,0 +1,4 @@
+// SPDX-License-Identifier: GPL-2.0
+
+pub(crate) const GSP_PAGE_SHIFT: usize = 12;
+pub(crate) const GSP_PAGE_SIZE: usize = 1 << GSP_PAGE_SHIFT;
diff --git a/drivers/gpu/nova-core/nova_core.rs b/drivers/gpu/nova-core/nova_core.rs
index cb2bbb30cba142265b354c9acf70349a6e40759e..fffcaee2249fe6cd7f55a7291c1e44be42e791d9 100644
--- a/drivers/gpu/nova-core/nova_core.rs
+++ b/drivers/gpu/nova-core/nova_core.rs
@@ -9,6 +9,7 @@
 mod firmware;
 mod gfw;
 mod gpu;
+mod gsp;
 mod regs;
 mod util;
 mod vbios;

-- 
2.51.0
Re: [PATCH v5 08/12] gpu: nova-core: firmware: process and prepare the GSP firmware
Posted by Danilo Krummrich 3 weeks ago
On 9/11/25 1:04 PM, Alexandre Courbot wrote:
> diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
> index 06a7ee8f4195759fb55ad483852724bb1ab46793..8505ee81c43e7628d1f099aff285244f8908c633 100644
> --- a/drivers/gpu/nova-core/gpu.rs
> +++ b/drivers/gpu/nova-core/gpu.rs
> @@ -8,6 +8,7 @@
>  use crate::fb::SysmemFlush;
>  use crate::firmware::booter::{BooterFirmware, BooterKind};
>  use crate::firmware::fwsec::{FwsecCommand, FwsecFirmware};
> +use crate::firmware::gsp::GspFirmware;
>  use crate::firmware::{Firmware, FIRMWARE_VERSION};
>  use crate::gfw;
>  use crate::regs;
> @@ -285,6 +286,11 @@ pub(crate) fn start_gsp(
>  
>          let bios = Vbios::new(dev, bar)?;
>  
> +        let _gsp_fw = KBox::pin_init(
> +            GspFirmware::new(dev, chipset, FIRMWARE_VERSION)?,
> +            GFP_KERNEL,
> +        )?;

Since we now have the infrastructure in place and the change is trival, I'd
prefer to make this a member of struct Gsp and make it part of the Gpu structure
directly (without separate allocation).

Should we need dynamic dispatch in the future, it's also trivial to make it its
own allocation again, but maybe we also get around the dyn dispatch. :)
Re: [PATCH v5 08/12] gpu: nova-core: firmware: process and prepare the GSP firmware
Posted by Alexandre Courbot 3 weeks ago
On Thu Sep 11, 2025 at 8:27 PM JST, Danilo Krummrich wrote:
> On 9/11/25 1:04 PM, Alexandre Courbot wrote:
>> diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
>> index 06a7ee8f4195759fb55ad483852724bb1ab46793..8505ee81c43e7628d1f099aff285244f8908c633 100644
>> --- a/drivers/gpu/nova-core/gpu.rs
>> +++ b/drivers/gpu/nova-core/gpu.rs
>> @@ -8,6 +8,7 @@
>>  use crate::fb::SysmemFlush;
>>  use crate::firmware::booter::{BooterFirmware, BooterKind};
>>  use crate::firmware::fwsec::{FwsecCommand, FwsecFirmware};
>> +use crate::firmware::gsp::GspFirmware;
>>  use crate::firmware::{Firmware, FIRMWARE_VERSION};
>>  use crate::gfw;
>>  use crate::regs;
>> @@ -285,6 +286,11 @@ pub(crate) fn start_gsp(
>>  
>>          let bios = Vbios::new(dev, bar)?;
>>  
>> +        let _gsp_fw = KBox::pin_init(
>> +            GspFirmware::new(dev, chipset, FIRMWARE_VERSION)?,
>> +            GFP_KERNEL,
>> +        )?;
>
> Since we now have the infrastructure in place and the change is trival, I'd
> prefer to make this a member of struct Gsp and make it part of the Gpu structure
> directly (without separate allocation).
>
> Should we need dynamic dispatch in the future, it's also trivial to make it its
> own allocation again, but maybe we also get around the dyn dispatch. :)

Ah, my previous talk about dynamic dispatch is a bit obsolete now that
the `Firmware` struct is gone. :) Sorry if that created confusion.

Truth is, this object is not supposed to survive `start_gsp`, and we can
dispose of it after the GSP is started as the bootloader will have
validated and copied it into the WPR region. So we don't want to store
it into `Gpu`, now or ever.

I guess we could technically store it on the stack, but IIRC I haven't
found a pin initializer for that in the kernel. And the structure might
be a little bit too big for that (several owned SGTables and a couple of
CoherentAllocations - we're talking hundreds of bytes).
Re: [PATCH v5 08/12] gpu: nova-core: firmware: process and prepare the GSP firmware
Posted by Danilo Krummrich 3 weeks ago
On 9/11/25 2:29 PM, Alexandre Courbot wrote:
> On Thu Sep 11, 2025 at 8:27 PM JST, Danilo Krummrich wrote:
>> On 9/11/25 1:04 PM, Alexandre Courbot wrote:
>>> diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
>>> index 06a7ee8f4195759fb55ad483852724bb1ab46793..8505ee81c43e7628d1f099aff285244f8908c633 100644
>>> --- a/drivers/gpu/nova-core/gpu.rs
>>> +++ b/drivers/gpu/nova-core/gpu.rs
>>> @@ -8,6 +8,7 @@
>>>  use crate::fb::SysmemFlush;
>>>  use crate::firmware::booter::{BooterFirmware, BooterKind};
>>>  use crate::firmware::fwsec::{FwsecCommand, FwsecFirmware};
>>> +use crate::firmware::gsp::GspFirmware;
>>>  use crate::firmware::{Firmware, FIRMWARE_VERSION};
>>>  use crate::gfw;
>>>  use crate::regs;
>>> @@ -285,6 +286,11 @@ pub(crate) fn start_gsp(
>>>  
>>>          let bios = Vbios::new(dev, bar)?;
>>>  
>>> +        let _gsp_fw = KBox::pin_init(
>>> +            GspFirmware::new(dev, chipset, FIRMWARE_VERSION)?,
>>> +            GFP_KERNEL,
>>> +        )?;
>>
>> Since we now have the infrastructure in place and the change is trival, I'd
>> prefer to make this a member of struct Gsp and make it part of the Gpu structure
>> directly (without separate allocation).
>>
>> Should we need dynamic dispatch in the future, it's also trivial to make it its
>> own allocation again, but maybe we also get around the dyn dispatch. :)
> 
> Ah, my previous talk about dynamic dispatch is a bit obsolete now that
> the `Firmware` struct is gone. :) Sorry if that created confusion.
> 
> Truth is, this object is not supposed to survive `start_gsp`, and we can
> dispose of it after the GSP is started as the bootloader will have
> validated and copied it into the WPR region. So we don't want to store
> it into `Gpu`, now or ever.

Ah, makes sense, so that's fine then.
[PATCH v5 09/12] gpu: nova-core: firmware: process the GSP bootloader
Posted by Alexandre Courbot 3 weeks ago
The GSP bootloader is a small RISC-V firmware that is loaded by Booter
onto the GSP core and is in charge of loading, validating, and starting
the actual GSP firmware.

It is a regular binary firmware file containing a specific header.
Create a type holding the DMA-mapped firmware as well as useful
information extracted from the header, and hook it into our firmware
structure for later use.

The GSP bootloader is stored into the `GspFirmware` structure, since it
is part of the GSP firmware package. This makes the `Firmware` structure
empty, so remove it.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
 drivers/gpu/nova-core/firmware.rs       | 18 +------
 drivers/gpu/nova-core/firmware/gsp.rs   |  7 +++
 drivers/gpu/nova-core/firmware/riscv.rs | 89 +++++++++++++++++++++++++++++++++
 drivers/gpu/nova-core/gpu.rs            |  5 +-
 4 files changed, 98 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
index cef910a7c2dc360139fafc2a01a050a9df40e45f..af7356a14def2ee92c3285878ea4de64eb48d344 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -15,11 +15,11 @@
 use crate::dma::DmaObject;
 use crate::falcon::FalconFirmware;
 use crate::gpu;
-use crate::gpu::Chipset;
 
 pub(crate) mod booter;
 pub(crate) mod fwsec;
 pub(crate) mod gsp;
+pub(crate) mod riscv;
 
 pub(crate) const FIRMWARE_VERSION: &str = "535.113.01";
 
@@ -36,22 +36,6 @@ fn request_nv_firmware(
         .and_then(|path| firmware::Firmware::request(&path, dev))
 }
 
-/// Structure encapsulating the firmware blobs required for the GPU to operate.
-#[expect(dead_code)]
-pub(crate) struct Firmware {
-    bootloader: firmware::Firmware,
-}
-
-impl Firmware {
-    pub(crate) fn new(dev: &device::Device, chipset: Chipset, ver: &str) -> Result<Firmware> {
-        let request = |name| request_nv_firmware(dev, chipset, name, ver);
-
-        Ok(Firmware {
-            bootloader: request("bootloader")?,
-        })
-    }
-}
-
 /// Structure used to describe some firmwares, notably FWSEC-FRTS.
 #[repr(C)]
 #[derive(Debug, Clone)]
diff --git a/drivers/gpu/nova-core/firmware/gsp.rs b/drivers/gpu/nova-core/firmware/gsp.rs
index 9242e8ef6a0183cbfe5e75a3d6552ff528a2cedc..d25a21b42fc8b3987c861965e6ea56d929570b70 100644
--- a/drivers/gpu/nova-core/firmware/gsp.rs
+++ b/drivers/gpu/nova-core/firmware/gsp.rs
@@ -7,6 +7,7 @@
 use kernel::scatterlist::{Owned, SGTable};
 
 use crate::dma::DmaObject;
+use crate::firmware::riscv::RiscvFirmware;
 use crate::gpu::{Architecture, Chipset};
 use crate::gsp::GSP_PAGE_SIZE;
 
@@ -127,6 +128,8 @@ pub(crate) struct GspFirmware {
     pub size: usize,
     /// Device-mapped GSP signatures matching the GPU's [`Chipset`].
     signatures: DmaObject,
+    /// GSP bootloader, verifies the GSP firmware before loading and running it.
+    bootloader: RiscvFirmware,
 }
 
 impl GspFirmware {
@@ -160,6 +163,9 @@ pub(crate) fn new<'a, 'b>(
             })
             .map_err(|_| ENOMEM)?;
 
+        let bl = super::request_nv_firmware(dev, chipset, "bootloader", ver)?;
+        let bootloader = RiscvFirmware::new(dev, &bl)?;
+
         Ok(try_pin_init!(Self {
             fw <- SGTable::new(dev, fw_vvec, DataDirection::ToDevice, GFP_KERNEL),
             level2 <- {
@@ -202,6 +208,7 @@ pub(crate) fn new<'a, 'b>(
             },
             size,
             signatures,
+            bootloader,
         }))
     }
 
diff --git a/drivers/gpu/nova-core/firmware/riscv.rs b/drivers/gpu/nova-core/firmware/riscv.rs
new file mode 100644
index 0000000000000000000000000000000000000000..b7eef29956995c49ab1466bb6a71a756731bf78a
--- /dev/null
+++ b/drivers/gpu/nova-core/firmware/riscv.rs
@@ -0,0 +1,89 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! Support for firmware binaries designed to run on a RISC-V core. Such firmwares files have a
+//! dedicated header.
+
+use kernel::device;
+use kernel::firmware::Firmware;
+use kernel::prelude::*;
+use kernel::transmute::FromBytes;
+
+use crate::dma::DmaObject;
+use crate::firmware::BinFirmware;
+
+/// Descriptor for microcode running on a RISC-V core.
+#[repr(C)]
+#[derive(Debug)]
+struct RmRiscvUCodeDesc {
+    version: u32,
+    bootloader_offset: u32,
+    bootloader_size: u32,
+    bootloader_param_offset: u32,
+    bootloader_param_size: u32,
+    riscv_elf_offset: u32,
+    riscv_elf_size: u32,
+    app_version: u32,
+    manifest_offset: u32,
+    manifest_size: u32,
+    monitor_data_offset: u32,
+    monitor_data_size: u32,
+    monitor_code_offset: u32,
+    monitor_code_size: u32,
+}
+
+// SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability.
+unsafe impl FromBytes for RmRiscvUCodeDesc {}
+
+impl RmRiscvUCodeDesc {
+    /// Interprets the header of `bin_fw` as a [`RmRiscvUCodeDesc`] and returns it.
+    ///
+    /// Fails if the header pointed at by `bin_fw` is not within the bounds of the firmware image.
+    fn new(bin_fw: &BinFirmware<'_>) -> Result<Self> {
+        let offset = bin_fw.hdr.header_offset as usize;
+
+        bin_fw
+            .fw
+            .get(offset..offset + size_of::<Self>())
+            .and_then(Self::from_bytes_copy)
+            .ok_or(EINVAL)
+    }
+}
+
+/// A parsed firmware for a RISC-V core, ready to be loaded and run.
+#[expect(unused)]
+pub(crate) struct RiscvFirmware {
+    /// Offset at which the code starts in the firmware image.
+    code_offset: u32,
+    /// Offset at which the data starts in the firmware image.
+    data_offset: u32,
+    /// Offset at which the manifest starts in the firmware image.
+    manifest_offset: u32,
+    /// Application version.
+    app_version: u32,
+    /// Device-mapped firmware image.
+    ucode: DmaObject,
+}
+
+impl RiscvFirmware {
+    /// Parses the RISC-V firmware image contained in `fw`.
+    pub(crate) fn new(dev: &device::Device<device::Bound>, fw: &Firmware) -> Result<Self> {
+        let bin_fw = BinFirmware::new(fw)?;
+
+        let riscv_desc = RmRiscvUCodeDesc::new(&bin_fw)?;
+
+        let ucode = {
+            let start = bin_fw.hdr.data_offset as usize;
+            let len = bin_fw.hdr.data_size as usize;
+
+            DmaObject::from_data(dev, fw.data().get(start..start + len).ok_or(EINVAL)?)?
+        };
+
+        Ok(Self {
+            ucode,
+            code_offset: riscv_desc.monitor_code_offset,
+            data_offset: riscv_desc.monitor_data_offset,
+            manifest_offset: riscv_desc.manifest_offset,
+            app_version: riscv_desc.app_version,
+        })
+    }
+}
diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index 8505ee81c43e7628d1f099aff285244f8908c633..507af10502d5d3a073f220cce0a2e5fd0cc938b2 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -9,7 +9,7 @@
 use crate::firmware::booter::{BooterFirmware, BooterKind};
 use crate::firmware::fwsec::{FwsecCommand, FwsecFirmware};
 use crate::firmware::gsp::GspFirmware;
-use crate::firmware::{Firmware, FIRMWARE_VERSION};
+use crate::firmware::FIRMWARE_VERSION;
 use crate::gfw;
 use crate::regs;
 use crate::vbios::Vbios;
@@ -179,7 +179,6 @@ pub(crate) struct Gpu {
     spec: Spec,
     /// MMIO mapping of PCI BAR 0
     bar: Arc<Devres<Bar0>>,
-    fw: Firmware,
     /// System memory page required for flushing all pending GPU-side memory writes done through
     /// PCIE into system memory, via sysmembar (A GPU-initiated HW memory-barrier operation).
     sysmem_flush: SysmemFlush,
@@ -331,8 +330,6 @@ pub(crate) fn new<'a>(
                     .inspect_err(|_| dev_err!(pdev.as_ref(), "GFW boot did not complete"))?;
             },
 
-            fw <- Firmware::new(pdev.as_ref(), spec.chipset, FIRMWARE_VERSION)?,
-
             sysmem_flush: SysmemFlush::register(pdev.as_ref(), bar, spec.chipset)?,
 
             gsp_falcon: Falcon::<Gsp>::new(

-- 
2.51.0
[PATCH v5 10/12] gpu: nova-core: firmware: use 570.144 firmware
Posted by Alexandre Courbot 3 weeks ago
570.144 is the latest available into linux-firmware as of this commit,
and the one we will use to start development of nova-core. It should
eventually be dropped for a newer version before the driver becomes able
to do anything useful. The newer firmware is expected to iron out some
of the inelegances of 570.144, notably related to packaging.

Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
 drivers/gpu/nova-core/firmware.rs | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
index af7356a14def2ee92c3285878ea4de64eb48d344..9a97e97a3c7b4ac65b66e4e5f092839720ded82d 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -21,7 +21,7 @@
 pub(crate) mod gsp;
 pub(crate) mod riscv;
 
-pub(crate) const FIRMWARE_VERSION: &str = "535.113.01";
+pub(crate) const FIRMWARE_VERSION: &str = "570.144";
 
 /// Requests the GPU firmware `name` suitable for `chipset`, with version `ver`.
 fn request_nv_firmware(

-- 
2.51.0
[PATCH v5 11/12] gpu: nova-core: Add base files for r570.144 firmware bindings
Posted by Alexandre Courbot 3 weeks ago
From: Alistair Popple <apopple@nvidia.com>

Interacting with the GSP currently requires using definitions from C
header files. Rust definitions for the types needed for Nova core will
be generated using the Rust bindgen tool. This patch adds the base
module to allow inclusion of the generated bindings. The generated
bindings themselves are added by subsequent patches when they are first
used.

Currently we only intend to support a single firmware version, 570.144,
with these bindings. Longer term we intend to move to a more stable GSP
interface that isn't tied to specific firmware versions.

Signed-off-by: Alistair Popple <apopple@nvidia.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
[acourbot@nvidia.com: adapt the bindings module comment a bit]
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
 drivers/gpu/nova-core/gsp.rs                      |  2 ++
 drivers/gpu/nova-core/gsp/fw.rs                   |  7 ++++++
 drivers/gpu/nova-core/gsp/fw/r570_144.rs          | 29 +++++++++++++++++++++++
 drivers/gpu/nova-core/gsp/fw/r570_144/bindings.rs |  1 +
 4 files changed, 39 insertions(+)

diff --git a/drivers/gpu/nova-core/gsp.rs b/drivers/gpu/nova-core/gsp.rs
index a0e7ec5f6c9c959d57540b3ebf4b782f2e002b08..6933848eb950e97238ab43c7b8f9e022ffe84eba 100644
--- a/drivers/gpu/nova-core/gsp.rs
+++ b/drivers/gpu/nova-core/gsp.rs
@@ -1,4 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0
 
+mod fw;
+
 pub(crate) const GSP_PAGE_SHIFT: usize = 12;
 pub(crate) const GSP_PAGE_SIZE: usize = 1 << GSP_PAGE_SHIFT;
diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs
new file mode 100644
index 0000000000000000000000000000000000000000..34226dd009824c1e44d0cb2e37b43434a364e433
--- /dev/null
+++ b/drivers/gpu/nova-core/gsp/fw.rs
@@ -0,0 +1,7 @@
+// SPDX-License-Identifier: GPL-2.0
+
+mod r570_144;
+
+// Alias to avoid repeating the version number with every use.
+#[expect(unused)]
+use r570_144 as bindings;
diff --git a/drivers/gpu/nova-core/gsp/fw/r570_144.rs b/drivers/gpu/nova-core/gsp/fw/r570_144.rs
new file mode 100644
index 0000000000000000000000000000000000000000..35cb0370a7c9b4604393931f9f3bf72ef4a794b4
--- /dev/null
+++ b/drivers/gpu/nova-core/gsp/fw/r570_144.rs
@@ -0,0 +1,29 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! Firmware bindings.
+//!
+//! Imports the generated bindings by `bindgen`.
+//!
+//! This module may not be directly used. Please abstract or re-export the needed symbols in the
+//! parent module instead.
+
+#![cfg_attr(test, allow(deref_nullptr))]
+#![cfg_attr(test, allow(unaligned_references))]
+#![cfg_attr(test, allow(unsafe_op_in_unsafe_fn))]
+#![allow(
+    dead_code,
+    unused_imports,
+    clippy::all,
+    clippy::undocumented_unsafe_blocks,
+    clippy::ptr_as_ptr,
+    clippy::ref_as_ptr,
+    missing_docs,
+    non_camel_case_types,
+    non_upper_case_globals,
+    non_snake_case,
+    improper_ctypes,
+    unreachable_pub,
+    unsafe_op_in_unsafe_fn
+)]
+use kernel::ffi;
+include!("r570_144/bindings.rs");
diff --git a/drivers/gpu/nova-core/gsp/fw/r570_144/bindings.rs b/drivers/gpu/nova-core/gsp/fw/r570_144/bindings.rs
new file mode 100644
index 0000000000000000000000000000000000000000..cec5940325151e407aa90128a35cb683afd436d7
--- /dev/null
+++ b/drivers/gpu/nova-core/gsp/fw/r570_144/bindings.rs
@@ -0,0 +1 @@
+// SPDX-License-Identifier: GPL-2.0

-- 
2.51.0
[PATCH v5 12/12] gpu: nova-core: compute layout of more framebuffer regions required for GSP
Posted by Alexandre Courbot 3 weeks ago
Compute more of the required FB layout information to boot the GSP
firmware.

This information is dependent on the firmware itself, so first we need
to import and abstract the required firmware bindings in the `nvfw`
module.

Then, a new FB HAL method is introduced in `fb::hal` that uses these
bindings and hardware information to compute the correct layout
information.

This information is then used in `fb` and the result made visible in
`FbLayout`.

These 3 things are grouped into the same patch to avoid lots of unused
warnings that would be tedious to work around. As they happen in
different files, they should not be too difficult to track separately.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
 drivers/gpu/nova-core/fb.rs                       |  65 ++++++++++-
 drivers/gpu/nova-core/firmware/gsp.rs             |   2 +-
 drivers/gpu/nova-core/firmware/riscv.rs           |   2 +-
 drivers/gpu/nova-core/gpu.rs                      |   4 +-
 drivers/gpu/nova-core/gsp.rs                      |   5 +
 drivers/gpu/nova-core/gsp/fw.rs                   |  96 ++++++++++++++++-
 drivers/gpu/nova-core/gsp/fw/r570_144.rs          |   1 -
 drivers/gpu/nova-core/gsp/fw/r570_144/bindings.rs | 125 ++++++++++++++++++++++
 8 files changed, 292 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/nova-core/fb.rs b/drivers/gpu/nova-core/fb.rs
index 27d9edab8347c5ed3be104d62a9e32709238bb92..4d6a1f45218368dd3c0e79679f2733b5a15725e0 100644
--- a/drivers/gpu/nova-core/fb.rs
+++ b/drivers/gpu/nova-core/fb.rs
@@ -10,7 +10,9 @@
 
 use crate::dma::DmaObject;
 use crate::driver::Bar0;
+use crate::firmware::gsp::GspFirmware;
 use crate::gpu::Chipset;
+use crate::gsp;
 use crate::regs;
 
 mod hal;
@@ -87,14 +89,28 @@ pub(crate) fn unregister(&self, bar: &Bar0) {
 #[derive(Debug)]
 #[expect(dead_code)]
 pub(crate) struct FbLayout {
+    /// Range of the framebuffer. Starts at `0`.
     pub(crate) fb: Range<u64>,
+    /// VGA workspace, small area of reserved memory at the end of the framebuffer.
     pub(crate) vga_workspace: Range<u64>,
+    /// FRTS range.
     pub(crate) frts: Range<u64>,
+    /// Memory area containing the GSP bootloader image.
+    pub(crate) boot: Range<u64>,
+    /// Memory area containing the GSP firmware image.
+    pub(crate) elf: Range<u64>,
+    /// WPR2 heap.
+    pub(crate) wpr2_heap: Range<u64>,
+    // WPR2 region range, starting with an instance of `GspFwWprMeta`.
+    pub(crate) wpr2: Range<u64>,
+    pub(crate) heap: Range<u64>,
+    pub(crate) vf_partition_count: u8,
 }
 
 impl FbLayout {
-    /// Computes the FB layout.
-    pub(crate) fn new(chipset: Chipset, bar: &Bar0) -> Result<Self> {
+    /// Computes the FB layout for `chipset`, for running the `bl` GSP bootloader and `gsp` GSP
+    /// firmware.
+    pub(crate) fn new(chipset: Chipset, bar: &Bar0, gsp_fw: &GspFirmware) -> Result<Self> {
         let hal = hal::fb_hal(chipset);
 
         let fb = {
@@ -138,10 +154,55 @@ pub(crate) fn new(chipset: Chipset, bar: &Bar0) -> Result<Self> {
             frts_base..frts_base + FRTS_SIZE
         };
 
+        let boot = {
+            const BOOTLOADER_DOWN_ALIGN: Alignment = Alignment::new::<SZ_4K>();
+            let bootloader_size = gsp_fw.bootloader.ucode.size() as u64;
+            let bootloader_base = (frts.start - bootloader_size).align_down(BOOTLOADER_DOWN_ALIGN);
+
+            bootloader_base..bootloader_base + bootloader_size
+        };
+
+        let elf = {
+            const ELF_DOWN_ALIGN: Alignment = Alignment::new::<SZ_64K>();
+            let elf_size = gsp_fw.size as u64;
+            let elf_addr = (boot.start - elf_size).align_down(ELF_DOWN_ALIGN);
+
+            elf_addr..elf_addr + elf_size
+        };
+
+        let wpr2_heap = {
+            const WPR2_HEAP_DOWN_ALIGN: Alignment = Alignment::new::<SZ_1M>();
+            let wpr2_heap_size =
+                crate::gsp::LibosParams::from_chipset(chipset).wpr_heap_size(chipset, fb.end);
+            let wpr2_heap_addr = (elf.start - wpr2_heap_size).align_down(WPR2_HEAP_DOWN_ALIGN);
+
+            wpr2_heap_addr..(elf.start).align_down(WPR2_HEAP_DOWN_ALIGN)
+        };
+
+        let wpr2 = {
+            const WPR2_DOWN_ALIGN: Alignment = Alignment::new::<SZ_1M>();
+            let wpr2_addr = (wpr2_heap.start - size_of::<gsp::GspFwWprMeta>() as u64)
+                .align_down(WPR2_DOWN_ALIGN);
+
+            wpr2_addr..frts.end
+        };
+
+        let heap = {
+            const HEAP_SIZE: u64 = SZ_1M as u64;
+
+            wpr2.start - HEAP_SIZE..wpr2.start
+        };
+
         Ok(Self {
             fb,
             vga_workspace,
             frts,
+            boot,
+            elf,
+            wpr2_heap,
+            wpr2,
+            heap,
+            vf_partition_count: 0,
         })
     }
 }
diff --git a/drivers/gpu/nova-core/firmware/gsp.rs b/drivers/gpu/nova-core/firmware/gsp.rs
index d25a21b42fc8b3987c861965e6ea56d929570b70..46b21385cbd623b5feea37a3ab5dfe0a90258155 100644
--- a/drivers/gpu/nova-core/firmware/gsp.rs
+++ b/drivers/gpu/nova-core/firmware/gsp.rs
@@ -129,7 +129,7 @@ pub(crate) struct GspFirmware {
     /// Device-mapped GSP signatures matching the GPU's [`Chipset`].
     signatures: DmaObject,
     /// GSP bootloader, verifies the GSP firmware before loading and running it.
-    bootloader: RiscvFirmware,
+    pub bootloader: RiscvFirmware,
 }
 
 impl GspFirmware {
diff --git a/drivers/gpu/nova-core/firmware/riscv.rs b/drivers/gpu/nova-core/firmware/riscv.rs
index b7eef29956995c49ab1466bb6a71a756731bf78a..b90acfc81e7898ed1726430001d31ebbc976f9f1 100644
--- a/drivers/gpu/nova-core/firmware/riscv.rs
+++ b/drivers/gpu/nova-core/firmware/riscv.rs
@@ -61,7 +61,7 @@ pub(crate) struct RiscvFirmware {
     /// Application version.
     app_version: u32,
     /// Device-mapped firmware image.
-    ucode: DmaObject,
+    pub ucode: DmaObject,
 }
 
 impl RiscvFirmware {
diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index 507af10502d5d3a073f220cce0a2e5fd0cc938b2..76e8953d70637cc9e27165fd0d97e715934e10f2 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -285,12 +285,12 @@ pub(crate) fn start_gsp(
 
         let bios = Vbios::new(dev, bar)?;
 
-        let _gsp_fw = KBox::pin_init(
+        let gsp_fw = KBox::pin_init(
             GspFirmware::new(dev, chipset, FIRMWARE_VERSION)?,
             GFP_KERNEL,
         )?;
 
-        let fb_layout = FbLayout::new(chipset, bar)?;
+        let fb_layout = FbLayout::new(chipset, bar, &gsp_fw)?;
         dev_dbg!(dev, "{:#x?}\n", fb_layout);
 
         Self::run_fwsec_frts(dev, gsp_falcon, bar, &bios, &fb_layout)?;
diff --git a/drivers/gpu/nova-core/gsp.rs b/drivers/gpu/nova-core/gsp.rs
index 6933848eb950e97238ab43c7b8f9e022ffe84eba..6db9892b711bb5f6de29c8e0a6bc5c361827ee4f 100644
--- a/drivers/gpu/nova-core/gsp.rs
+++ b/drivers/gpu/nova-core/gsp.rs
@@ -2,5 +2,10 @@
 
 mod fw;
 
+pub(crate) use fw::{GspFwWprMeta, LibosParams};
+
+use kernel::ptr::Alignment;
+
 pub(crate) const GSP_PAGE_SHIFT: usize = 12;
 pub(crate) const GSP_PAGE_SIZE: usize = 1 << GSP_PAGE_SHIFT;
+pub(crate) const GSP_HEAP_ALIGNMENT: Alignment = Alignment::new::<{ 1 << 20 }>();
diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs
index 34226dd009824c1e44d0cb2e37b43434a364e433..181baa4017705c65adfc1ad0a8454d592f9c69da 100644
--- a/drivers/gpu/nova-core/gsp/fw.rs
+++ b/drivers/gpu/nova-core/gsp/fw.rs
@@ -3,5 +3,99 @@
 mod r570_144;
 
 // Alias to avoid repeating the version number with every use.
-#[expect(unused)]
 use r570_144 as bindings;
+
+use core::ops::Range;
+
+use kernel::ptr::Alignable;
+use kernel::sizes::SZ_1M;
+
+use crate::gpu::Chipset;
+use crate::gsp;
+
+/// Dummy type to group methods related to heap parameters for running the GSP firmware.
+pub(crate) struct GspFwHeapParams(());
+
+impl GspFwHeapParams {
+    /// Returns the amount of GSP-RM heap memory used during GSP-RM boot and initialization (up to
+    /// and including the first client subdevice allocation).
+    fn base_rm_size(_chipset: Chipset) -> u64 {
+        // TODO: this needs to be updated to return the correct value for Hopper+ once support for
+        // them is added:
+        // u64::from(bindings::GSP_FW_HEAP_PARAM_BASE_RM_SIZE_GH100)
+        u64::from(bindings::GSP_FW_HEAP_PARAM_BASE_RM_SIZE_TU10X)
+    }
+
+    /// Returns the amount of heap memory required to support a single channel allocation.
+    fn client_alloc_size() -> u64 {
+        u64::from(bindings::GSP_FW_HEAP_PARAM_CLIENT_ALLOC_SIZE)
+            .align_up(gsp::GSP_HEAP_ALIGNMENT)
+            .unwrap_or(u64::MAX)
+    }
+
+    /// Returns the amount of memory to reserve for management purposes for a framebuffer of size
+    /// `fb_size`.
+    fn management_overhead(fb_size: u64) -> u64 {
+        let fb_size_gb = fb_size.div_ceil(kernel::sizes::SZ_1G as u64);
+
+        u64::from(bindings::GSP_FW_HEAP_PARAM_SIZE_PER_GB_FB)
+            .saturating_mul(fb_size_gb)
+            .align_up(gsp::GSP_HEAP_ALIGNMENT)
+            .unwrap_or(u64::MAX)
+    }
+}
+
+/// Heap memory requirements and constraints for a given version of the GSP LIBOS.
+pub(crate) struct LibosParams {
+    /// The base amount of heap required by the GSP operating system, in bytes.
+    carveout_size: u64,
+    /// The minimum and maximum sizes allowed for the GSP FW heap, in bytes.
+    allowed_heap_size: Range<u64>,
+}
+
+impl LibosParams {
+    /// Version 2 of the GSP LIBOS (Turing and GA100)
+    const LIBOS2: LibosParams = LibosParams {
+        carveout_size: bindings::GSP_FW_HEAP_PARAM_OS_SIZE_LIBOS2 as u64,
+        allowed_heap_size: bindings::GSP_FW_HEAP_SIZE_OVERRIDE_LIBOS2_MIN_MB as u64 * SZ_1M as u64
+            ..bindings::GSP_FW_HEAP_SIZE_OVERRIDE_LIBOS2_MAX_MB as u64 * SZ_1M as u64,
+    };
+
+    /// Version 3 of the GSP LIBOS (GA102+)
+    const LIBOS3: LibosParams = LibosParams {
+        carveout_size: bindings::GSP_FW_HEAP_PARAM_OS_SIZE_LIBOS3_BAREMETAL as u64,
+        allowed_heap_size: bindings::GSP_FW_HEAP_SIZE_OVERRIDE_LIBOS3_BAREMETAL_MIN_MB as u64
+            * SZ_1M as u64
+            ..bindings::GSP_FW_HEAP_SIZE_OVERRIDE_LIBOS3_BAREMETAL_MAX_MB as u64 * SZ_1M as u64,
+    };
+
+    /// Returns the libos parameters corresponding to `chipset`.
+    pub(crate) fn from_chipset(chipset: Chipset) -> &'static LibosParams {
+        if chipset < Chipset::GA102 {
+            &Self::LIBOS2
+        } else {
+            &Self::LIBOS3
+        }
+    }
+
+    /// Returns the amount of memory (in bytes) to allocate for the WPR heap for a framebuffer size
+    /// of `fb_size` (in bytes) for `chipset`.
+    pub(crate) fn wpr_heap_size(&self, chipset: Chipset, fb_size: u64) -> u64 {
+        // The WPR heap will contain the following:
+        // LIBOS carveout,
+        self.carveout_size
+            // RM boot working memory,
+            .saturating_add(GspFwHeapParams::base_rm_size(chipset))
+            // One RM client,
+            .saturating_add(GspFwHeapParams::client_alloc_size())
+            // Overhead for memory management.
+            .saturating_add(GspFwHeapParams::management_overhead(fb_size))
+            // Clamp to the supported heap sizes.
+            .clamp(self.allowed_heap_size.start, self.allowed_heap_size.end - 1)
+    }
+}
+
+/// Structure passed to the GSP bootloader, containing the framebuffer layout as well as the DMA
+/// addresses of the GSP bootloader and firmware.
+#[repr(transparent)]
+pub(crate) struct GspFwWprMeta(bindings::GspFwWprMeta);
diff --git a/drivers/gpu/nova-core/gsp/fw/r570_144.rs b/drivers/gpu/nova-core/gsp/fw/r570_144.rs
index 35cb0370a7c9b4604393931f9f3bf72ef4a794b4..82a973cd99c38eee39a0554e855a60e61dba2d01 100644
--- a/drivers/gpu/nova-core/gsp/fw/r570_144.rs
+++ b/drivers/gpu/nova-core/gsp/fw/r570_144.rs
@@ -12,7 +12,6 @@
 #![cfg_attr(test, allow(unsafe_op_in_unsafe_fn))]
 #![allow(
     dead_code,
-    unused_imports,
     clippy::all,
     clippy::undocumented_unsafe_blocks,
     clippy::ptr_as_ptr,
diff --git a/drivers/gpu/nova-core/gsp/fw/r570_144/bindings.rs b/drivers/gpu/nova-core/gsp/fw/r570_144/bindings.rs
index cec5940325151e407aa90128a35cb683afd436d7..0407000cca2296e713cc4701b635718fe51488cb 100644
--- a/drivers/gpu/nova-core/gsp/fw/r570_144/bindings.rs
+++ b/drivers/gpu/nova-core/gsp/fw/r570_144/bindings.rs
@@ -1 +1,126 @@
 // SPDX-License-Identifier: GPL-2.0
+
+pub const GSP_FW_HEAP_PARAM_OS_SIZE_LIBOS2: u32 = 0;
+pub const GSP_FW_HEAP_PARAM_OS_SIZE_LIBOS3_BAREMETAL: u32 = 23068672;
+pub const GSP_FW_HEAP_PARAM_BASE_RM_SIZE_TU10X: u32 = 8388608;
+pub const GSP_FW_HEAP_PARAM_SIZE_PER_GB_FB: u32 = 98304;
+pub const GSP_FW_HEAP_PARAM_CLIENT_ALLOC_SIZE: u32 = 100663296;
+pub const GSP_FW_HEAP_SIZE_OVERRIDE_LIBOS2_MIN_MB: u32 = 64;
+pub const GSP_FW_HEAP_SIZE_OVERRIDE_LIBOS2_MAX_MB: u32 = 256;
+pub const GSP_FW_HEAP_SIZE_OVERRIDE_LIBOS3_BAREMETAL_MIN_MB: u32 = 88;
+pub const GSP_FW_HEAP_SIZE_OVERRIDE_LIBOS3_BAREMETAL_MAX_MB: u32 = 280;
+pub type __u8 = ffi::c_uchar;
+pub type __u16 = ffi::c_ushort;
+pub type __u32 = ffi::c_uint;
+pub type __u64 = ffi::c_ulonglong;
+pub type u8_ = __u8;
+pub type u16_ = __u16;
+pub type u32_ = __u32;
+pub type u64_ = __u64;
+#[repr(C)]
+#[derive(Copy, Clone)]
+pub struct GspFwWprMeta {
+    pub magic: u64_,
+    pub revision: u64_,
+    pub sysmemAddrOfRadix3Elf: u64_,
+    pub sizeOfRadix3Elf: u64_,
+    pub sysmemAddrOfBootloader: u64_,
+    pub sizeOfBootloader: u64_,
+    pub bootloaderCodeOffset: u64_,
+    pub bootloaderDataOffset: u64_,
+    pub bootloaderManifestOffset: u64_,
+    pub __bindgen_anon_1: GspFwWprMeta__bindgen_ty_1,
+    pub gspFwRsvdStart: u64_,
+    pub nonWprHeapOffset: u64_,
+    pub nonWprHeapSize: u64_,
+    pub gspFwWprStart: u64_,
+    pub gspFwHeapOffset: u64_,
+    pub gspFwHeapSize: u64_,
+    pub gspFwOffset: u64_,
+    pub bootBinOffset: u64_,
+    pub frtsOffset: u64_,
+    pub frtsSize: u64_,
+    pub gspFwWprEnd: u64_,
+    pub fbSize: u64_,
+    pub vgaWorkspaceOffset: u64_,
+    pub vgaWorkspaceSize: u64_,
+    pub bootCount: u64_,
+    pub __bindgen_anon_2: GspFwWprMeta__bindgen_ty_2,
+    pub gspFwHeapVfPartitionCount: u8_,
+    pub flags: u8_,
+    pub padding: [u8_; 2usize],
+    pub pmuReservedSize: u32_,
+    pub verified: u64_,
+}
+#[repr(C)]
+#[derive(Copy, Clone)]
+pub union GspFwWprMeta__bindgen_ty_1 {
+    pub __bindgen_anon_1: GspFwWprMeta__bindgen_ty_1__bindgen_ty_1,
+    pub __bindgen_anon_2: GspFwWprMeta__bindgen_ty_1__bindgen_ty_2,
+}
+#[repr(C)]
+#[derive(Debug, Default, Copy, Clone)]
+pub struct GspFwWprMeta__bindgen_ty_1__bindgen_ty_1 {
+    pub sysmemAddrOfSignature: u64_,
+    pub sizeOfSignature: u64_,
+}
+#[repr(C)]
+#[derive(Debug, Default, Copy, Clone)]
+pub struct GspFwWprMeta__bindgen_ty_1__bindgen_ty_2 {
+    pub gspFwHeapFreeListWprOffset: u32_,
+    pub unused0: u32_,
+    pub unused1: u64_,
+}
+impl Default for GspFwWprMeta__bindgen_ty_1 {
+    fn default() -> Self {
+        let mut s = ::core::mem::MaybeUninit::<Self>::uninit();
+        unsafe {
+            ::core::ptr::write_bytes(s.as_mut_ptr(), 0, 1);
+            s.assume_init()
+        }
+    }
+}
+#[repr(C)]
+#[derive(Copy, Clone)]
+pub union GspFwWprMeta__bindgen_ty_2 {
+    pub __bindgen_anon_1: GspFwWprMeta__bindgen_ty_2__bindgen_ty_1,
+    pub __bindgen_anon_2: GspFwWprMeta__bindgen_ty_2__bindgen_ty_2,
+}
+#[repr(C)]
+#[derive(Debug, Default, Copy, Clone)]
+pub struct GspFwWprMeta__bindgen_ty_2__bindgen_ty_1 {
+    pub partitionRpcAddr: u64_,
+    pub partitionRpcRequestOffset: u16_,
+    pub partitionRpcReplyOffset: u16_,
+    pub elfCodeOffset: u32_,
+    pub elfDataOffset: u32_,
+    pub elfCodeSize: u32_,
+    pub elfDataSize: u32_,
+    pub lsUcodeVersion: u32_,
+}
+#[repr(C)]
+#[derive(Debug, Default, Copy, Clone)]
+pub struct GspFwWprMeta__bindgen_ty_2__bindgen_ty_2 {
+    pub partitionRpcPadding: [u32_; 4usize],
+    pub sysmemAddrOfCrashReportQueue: u64_,
+    pub sizeOfCrashReportQueue: u32_,
+    pub lsUcodeVersionPadding: [u32_; 1usize],
+}
+impl Default for GspFwWprMeta__bindgen_ty_2 {
+    fn default() -> Self {
+        let mut s = ::core::mem::MaybeUninit::<Self>::uninit();
+        unsafe {
+            ::core::ptr::write_bytes(s.as_mut_ptr(), 0, 1);
+            s.assume_init()
+        }
+    }
+}
+impl Default for GspFwWprMeta {
+    fn default() -> Self {
+        let mut s = ::core::mem::MaybeUninit::<Self>::uninit();
+        unsafe {
+            ::core::ptr::write_bytes(s.as_mut_ptr(), 0, 1);
+            s.assume_init()
+        }
+    }
+}

-- 
2.51.0