[PATCH v9 24/31] gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot

John Hubbard posted 31 patches 1 week ago
Posted by John Hubbard 1 week ago
Add boot_fmc() which builds and sends the Chain of Trust message to FSP,
and FmcBootArgs which bundles the DMA-coherent boot parameters that FSP
reads at boot time. The FspFirmware struct fields become pub(crate) and
fmc_full changes from DmaObject to KVec<u8> for CPU-side signature
extraction.

Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/firmware/fsp.rs |   8 +-
 drivers/gpu/nova-core/fsp.rs          | 170 +++++++++++++++++++++++++-
 drivers/gpu/nova-core/gpu.rs          |   1 -
 drivers/gpu/nova-core/mctp.rs         |   7 --
 4 files changed, 172 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/nova-core/firmware/fsp.rs b/drivers/gpu/nova-core/firmware/fsp.rs
index 028f651553e0..5bd15b644825 100644
--- a/drivers/gpu/nova-core/firmware/fsp.rs
+++ b/drivers/gpu/nova-core/firmware/fsp.rs
@@ -14,16 +14,16 @@
     gpu::Chipset, //
 };
 
-#[expect(unused)]
+#[expect(dead_code)]
 pub(crate) struct FspFirmware {
     /// FMC firmware image data (only the "image" ELF section).
-    fmc_image: DmaObject,
-    /// Full FMC ELF (for signature extraction.
+    pub(crate) fmc_image: DmaObject,
+    /// Full FMC ELF for signature extraction.
     pub(crate) fmc_elf: Firmware,
 }
 
 impl FspFirmware {
-    #[expect(unused)]
+    #[expect(dead_code)]
     pub(crate) fn new(
         dev: &device::Device<device::Bound>,
         chipset: Chipset,
diff --git a/drivers/gpu/nova-core/fsp.rs b/drivers/gpu/nova-core/fsp.rs
index db07fc227aa1..6edbc4fe7066 100644
--- a/drivers/gpu/nova-core/fsp.rs
+++ b/drivers/gpu/nova-core/fsp.rs
@@ -8,8 +8,20 @@
 
 use kernel::{
     device,
+    dma::{
+        Coherent,
+        CoherentBox, //
+    },
     io::poll::read_poll_timeout,
     prelude::*,
+    ptr::{
+        Alignable,
+        Alignment, //
+    },
+    sizes::{
+        SZ_1M,
+        SZ_2M, //
+    },
     time::Delta,
     transmute::{
         AsBytes,
@@ -38,7 +50,6 @@ pub(crate) const fn new(version: u16) -> Self {
     }
 
     /// Return the raw protocol version number for the wire format.
-    #[expect(dead_code)]
     pub(crate) const fn raw(self) -> u16 {
         self.0
     }
@@ -163,6 +174,35 @@ struct NvdmPayloadCommandResponse {
     error_code: u32,
 }
 
+/// NVDM (NVIDIA Device Management) COT (Chain of Trust) payload structure.
+/// This is the main message payload sent to FSP for Chain of Trust.
+#[repr(C, packed)]
+#[derive(Clone, Copy)]
+struct NvdmPayloadCot {
+    version: u16,
+    size: u16,
+    gsp_fmc_sysmem_offset: u64,
+    frts_sysmem_offset: u64,
+    frts_sysmem_size: u32,
+    frts_vidmem_offset: u64,
+    frts_vidmem_size: u32,
+    hash384: [u8; FSP_HASH_SIZE],
+    public_key: [u8; FSP_PKEY_SIZE],
+    signature: [u8; FSP_SIG_SIZE],
+    gsp_boot_args_sysmem_offset: u64,
+}
+
+/// Complete FSP message structure with MCTP and NVDM headers.
+#[repr(C, packed)]
+#[derive(Clone, Copy)]
+struct FspMessage {
+    mctp_header: u32,
+    nvdm_header: u32,
+    cot: NvdmPayloadCot,
+}
+
+// SAFETY: FspMessage is a packed C struct with only integral fields.
+unsafe impl AsBytes for FspMessage {}
 /// Complete FSP response structure with MCTP and NVDM headers.
 #[repr(C, packed)]
 #[derive(Clone, Copy)]
@@ -183,6 +223,74 @@ pub(crate) trait MessageToFsp: AsBytes {
     /// NVDM type identifying this message to FSP.
     const NVDM_TYPE: u32;
 }
+
+impl MessageToFsp for FspMessage {
+    const NVDM_TYPE: u32 = NvdmType::Cot as u32;
+}
+
+/// Bundled arguments for FMC boot via FSP Chain of Trust.
+pub(crate) struct FmcBootArgs<'a> {
+    chipset: crate::gpu::Chipset,
+    fmc_image_fw: &'a crate::dma::DmaObject,
+    fmc_boot_params: Coherent<GspFmcBootParams>,
+    resume: bool,
+    signatures: &'a FmcSignatures,
+}
+
+impl<'a> FmcBootArgs<'a> {
+    /// Build FMC boot arguments, allocating the DMA-coherent boot parameter
+    /// structure that FSP will read.
+    #[expect(dead_code)]
+    #[allow(clippy::too_many_arguments)]
+    pub(crate) fn new(
+        dev: &device::Device<device::Bound>,
+        chipset: crate::gpu::Chipset,
+        fmc_image_fw: &'a crate::dma::DmaObject,
+        wpr_meta_addr: u64,
+        wpr_meta_size: u32,
+        libos_addr: u64,
+        resume: bool,
+        signatures: &'a FmcSignatures,
+    ) -> Result<Self> {
+        // `GSP_DMA_TARGET_*` is not in the current Rust bindings yet.
+        const GSP_DMA_TARGET_COHERENT_SYSTEM: u32 = 1;
+        const GSP_DMA_TARGET_NONCOHERENT_SYSTEM: u32 = 2;
+
+        let mut fmc_boot_params = CoherentBox::<GspFmcBootParams>::zeroed(dev, GFP_KERNEL)?;
+
+        // Blackwell FSP expects wpr_carveout_offset and wpr_carveout_size to be zero;
+        // it obtains WPR info from other sources.
+        fmc_boot_params.boot_gsp_rm_params = GspAcrBootGspRmParams {
+            target: GSP_DMA_TARGET_COHERENT_SYSTEM,
+            gsp_rm_desc_size: wpr_meta_size,
+            gsp_rm_desc_offset: wpr_meta_addr,
+            b_is_gsp_rm_boot: 1,
+            ..Default::default()
+        };
+
+        fmc_boot_params.gsp_rm_params = GspRmParams {
+            target: GSP_DMA_TARGET_NONCOHERENT_SYSTEM,
+            boot_args_offset: libos_addr,
+        };
+
+        let fmc_boot_params: Coherent<GspFmcBootParams> = fmc_boot_params.into();
+
+        Ok(Self {
+            chipset,
+            fmc_image_fw,
+            fmc_boot_params,
+            resume,
+            signatures,
+        })
+    }
+
+    /// DMA address of the FMC boot parameters, needed after boot for lockdown
+    /// release polling.
+    #[expect(dead_code)]
+    pub(crate) fn boot_params_dma_handle(&self) -> u64 {
+        self.fmc_boot_params.dma_handle()
+    }
+}
 /// FSP interface for Hopper/Blackwell GPUs.
 pub(crate) struct Fsp;
 
@@ -284,8 +392,66 @@ pub(crate) fn extract_fmc_signatures(
         Ok(signatures)
     }
 
-    /// Send message to FSP and wait for response.
+    /// Boot GSP FMC via FSP Chain of Trust.
+    ///
+    /// Builds the COT message from the pre-configured [`FmcBootArgs`], sends it
+    /// to FSP, and waits for the response.
     #[expect(dead_code)]
+    pub(crate) fn boot_fmc(
+        dev: &device::Device<device::Bound>,
+        bar: &crate::driver::Bar0,
+        fsp_falcon: &crate::falcon::Falcon<crate::falcon::fsp::Fsp>,
+        args: &FmcBootArgs<'_>,
+    ) -> Result {
+        dev_dbg!(dev, "Starting FSP boot sequence for {}\n", args.chipset);
+
+        let fmc_addr = args.fmc_image_fw.dma_handle();
+        let fmc_boot_params_addr = args.fmc_boot_params.dma_handle();
+
+        // frts_offset is relative to FB end: FRTS_location = FB_END - frts_offset
+        let frts_offset = if !args.resume {
+            let frts_reserved_size = crate::fb::calc_non_wpr_heap_size(args.chipset)
+                .checked_add(u64::from(crate::fb::PMU_RESERVED_SIZE))
+                .ok_or(EINVAL)?;
+
+            frts_reserved_size
+                .align_up(Alignment::new::<SZ_2M>())
+                .ok_or(EINVAL)?
+        } else {
+            0
+        };
+        let frts_size: u32 = if !args.resume { SZ_1M as u32 } else { 0 };
+
+        let msg = KBox::new(
+            FspMessage {
+                mctp_header: MctpHeader::single_packet().raw(),
+                nvdm_header: NvdmHeader::new(NvdmType::Cot).raw(),
+
+                cot: NvdmPayloadCot {
+                    version: args.chipset.fsp_cot_version().ok_or(ENOTSUPP)?.raw(),
+                    size: u16::try_from(core::mem::size_of::<NvdmPayloadCot>())
+                        .map_err(|_| EINVAL)?,
+                    gsp_fmc_sysmem_offset: fmc_addr,
+                    frts_sysmem_offset: 0,
+                    frts_sysmem_size: 0,
+                    frts_vidmem_offset: frts_offset,
+                    frts_vidmem_size: frts_size,
+                    hash384: args.signatures.hash384,
+                    public_key: args.signatures.public_key,
+                    signature: args.signatures.signature,
+                    gsp_boot_args_sysmem_offset: fmc_boot_params_addr,
+                },
+            },
+            GFP_KERNEL,
+        )?;
+
+        Self::send_sync_fsp(dev, bar, fsp_falcon, &*msg)?;
+
+        dev_dbg!(dev, "FSP Chain of Trust completed successfully\n");
+        Ok(())
+    }
+
+    /// Send message to FSP and wait for response.
     fn send_sync_fsp<M>(
         dev: &device::Device<device::Bound>,
         bar: &crate::driver::Bar0,
diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index ab378c0297c7..780a4d1169e2 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -138,7 +138,6 @@ pub(crate) const fn needs_fwsec_bootloader(self) -> bool {
     ///
     /// Hopper (GH100) uses version 1, Blackwell uses version 2.
     /// Returns `None` for architectures that do not use FSP.
-    #[expect(dead_code)]
     pub(crate) const fn fsp_cot_version(&self) -> Option<FspCotVersion> {
         match self.arch() {
             Architecture::Hopper => Some(FspCotVersion::new(1)),
diff --git a/drivers/gpu/nova-core/mctp.rs b/drivers/gpu/nova-core/mctp.rs
index 9e052d916e79..c23e8ec69636 100644
--- a/drivers/gpu/nova-core/mctp.rs
+++ b/drivers/gpu/nova-core/mctp.rs
@@ -6,8 +6,6 @@
 //! Device Management) messages between the kernel driver and GPU firmware
 //! processors such as FSP and GSP.
 
-#![expect(dead_code)]
-
 /// NVDM message type identifiers carried over MCTP.
 #[derive(Debug, Clone, Copy, PartialEq, Eq)]
 #[repr(u8)]
@@ -101,11 +99,6 @@ pub(crate) fn nvdm_type(self) -> core::result::Result<NvdmType, u8> {
         NvdmType::try_from(self.raw_nvdm_type())
     }
 
-    /// Extract the NVDM type field as a raw value.
-    pub(crate) fn nvdm_type_raw(self) -> u32 {
-        u32::from(self.raw_nvdm_type())
-    }
-
     /// Set the NVDM type field from a typed value.
     pub(crate) fn set_nvdm_type(self, nvdm_type: NvdmType) -> Self {
         self.set_raw_nvdm_type(u8::from(nvdm_type))
-- 
2.53.0
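
For readers following the frts_offset computation in boot_fmc(), here is
a minimal standalone sketch of the same arithmetic: add the two reserved
sizes with overflow checking, then round up to a 2 MiB boundary. This is
plain userspace Rust, not the kernel's `Alignable`/`Alignment` API. The
names `align_up` and `frts_offset` and the parameters are illustrative
assumptions; the real inputs come from `crate::fb::calc_non_wpr_heap_size()`
and `crate::fb::PMU_RESERVED_SIZE`, whose values are not shown in this patch.

```rust
/// 2 MiB, the alignment used for the FRTS offset in the patch.
const SZ_2M: u64 = 2 * 1024 * 1024;

/// Round `value` up to the next multiple of `align` (must be a power of two).
/// Returns `None` if the rounding would overflow `u64`.
fn align_up(value: u64, align: u64) -> Option<u64> {
    debug_assert!(align.is_power_of_two());
    let mask = align - 1;
    value.checked_add(mask).map(|v| v & !mask)
}

/// Offset of the FRTS region, measured back from the end of the framebuffer
/// (FRTS_location = FB_END - frts_offset). Returns 0 on resume, mirroring
/// the `resume` branch in `boot_fmc()`.
///
/// `non_wpr_heap_size` and `pmu_reserved_size` are hypothetical inputs here.
fn frts_offset(non_wpr_heap_size: u64, pmu_reserved_size: u32, resume: bool) -> Option<u64> {
    if resume {
        return Some(0);
    }
    let reserved = non_wpr_heap_size.checked_add(u64::from(pmu_reserved_size))?;
    align_up(reserved, SZ_2M)
}
```

Returning `Option` keeps both failure points (the addition and the
rounding) visible to the caller, which is roughly what the two
`.ok_or(EINVAL)?` calls do in the patch.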
Re: [PATCH v9 24/31] gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot
Posted by Alexandre Courbot 2 days, 10 hours ago
On Thu Mar 26, 2026 at 10:38 AM JST, John Hubbard wrote:
> Add boot_fmc() which builds and sends the Chain of Trust message to FSP,
> and FmcBootArgs which bundles the DMA-coherent boot parameters that FSP
> reads at boot time. The FspFirmware struct fields become pub(crate) and
> fmc_full changes from DmaObject to KVec<u8> for CPU-side signature
> extraction.
>
> Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---
>  drivers/gpu/nova-core/firmware/fsp.rs |   8 +-
>  drivers/gpu/nova-core/fsp.rs          | 170 +++++++++++++++++++++++++-
>  drivers/gpu/nova-core/gpu.rs          |   1 -
>  drivers/gpu/nova-core/mctp.rs         |   7 --
>  4 files changed, 172 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/gpu/nova-core/firmware/fsp.rs b/drivers/gpu/nova-core/firmware/fsp.rs
> index 028f651553e0..5bd15b644825 100644
> --- a/drivers/gpu/nova-core/firmware/fsp.rs
> +++ b/drivers/gpu/nova-core/firmware/fsp.rs
> @@ -14,16 +14,16 @@
>      gpu::Chipset, //
>  };
>  
> -#[expect(unused)]
> +#[expect(dead_code)]

Why?

>  pub(crate) struct FspFirmware {
>      /// FMC firmware image data (only the "image" ELF section).
> -    fmc_image: DmaObject,
> -    /// Full FMC ELF (for signature extraction.
> +    pub(crate) fmc_image: DmaObject,
> +    /// Full FMC ELF for signature extraction.
>      pub(crate) fmc_elf: Firmware,

Let's make the comment for `fmc_elf` correct from its introduction
instead of fixing it here?

>  }
>  
>  impl FspFirmware {
> -    #[expect(unused)]
> +    #[expect(dead_code)]

Same here, I don't see the point. If it's not unused anymore, let's
remove the `expect` entirely. If it is, let's keep the more accurate
`unused`.

Btw, this is the kind of stuff you'd normally catch by looking at the
diff before sending it. I shouldn't have to do this.
Re: [PATCH v9 24/31] gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot
Posted by John Hubbard 2 days, 2 hours ago
On 3/30/26 8:11 AM, Alexandre Courbot wrote:
> On Thu Mar 26, 2026 at 10:38 AM JST, John Hubbard wrote:
>> Add boot_fmc() which builds and sends the Chain of Trust message to FSP,
>> and FmcBootArgs which bundles the DMA-coherent boot parameters that FSP
>> reads at boot time. The FspFirmware struct fields become pub(crate) and
>> fmc_full changes from DmaObject to KVec<u8> for CPU-side signature
>> extraction.
>>
>> Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
>> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
>> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
>> ---
>>  drivers/gpu/nova-core/firmware/fsp.rs |   8 +-
>>  drivers/gpu/nova-core/fsp.rs          | 170 +++++++++++++++++++++++++-
>>  drivers/gpu/nova-core/gpu.rs          |   1 -
>>  drivers/gpu/nova-core/mctp.rs         |   7 --
>>  4 files changed, 172 insertions(+), 14 deletions(-)
>>
>> diff --git a/drivers/gpu/nova-core/firmware/fsp.rs b/drivers/gpu/nova-core/firmware/fsp.rs
>> index 028f651553e0..5bd15b644825 100644
>> --- a/drivers/gpu/nova-core/firmware/fsp.rs
>> +++ b/drivers/gpu/nova-core/firmware/fsp.rs
>> @@ -14,16 +14,16 @@
>>      gpu::Chipset, //
>>  };
>>  
>> -#[expect(unused)]
>> +#[expect(dead_code)]
> 
> Why?

`dead_code` is the specific lint that fires here. `unused` is a lint
group that *contains* `dead_code` plus a dozen others (unused_imports,
unused_variables, unused_mut, etc). Since the only warning is "this
struct is never constructed," `dead_code` names the exact lint.

That said, you're right that this change doesn't belong in this patch.
I've moved it to the introducing commit in v10.

> 
>>  pub(crate) struct FspFirmware {
>>      /// FMC firmware image data (only the "image" ELF section).
>> -    fmc_image: DmaObject,
>> -    /// Full FMC ELF (for signature extraction.
>> +    pub(crate) fmc_image: DmaObject,
>> +    /// Full FMC ELF for signature extraction.
>>      pub(crate) fmc_elf: Firmware,
> 
> Let's make the comment for `fmc_elf` correct from its introduction
> instead of fixing it here?

Agreed, done in v10. The introducing commit now has the correct
comment from the start.

> 
>>  }
>>  
>>  impl FspFirmware {
>> -    #[expect(unused)]
>> +    #[expect(dead_code)]
> 
> Same here, I don't see the point. If it's not unused anymore, let's
> remove the `expect` entirely. If it is, let's keep the more accurate
> `unused`.

One small correction: `unused` is actually the *less* specific choice.
It is a lint group containing dead_code, unused_imports,
unused_variables, and about a dozen others. `dead_code` is the single
lint that actually fires on an unreferenced struct or function.
#[expect(...)] is meant to name the expected warning, so using the
specific lint is more precise.
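
A minimal sketch of the difference, outside the kernel tree and assuming
a toolchain where `#[expect]` is stable (Rust 1.81 or later). The item
names are made up for illustration. Both attributes are satisfied here,
since, as I understand the lint-attribute rules, an expectation on a
group is fulfilled when any lint in that group fires:

```rust
// `dead_code` is the single lint that fires on an item that is never used,
// so this names exactly the warning being expected.
#[expect(dead_code)]
struct NeverConstructed {
    value: u32,
}

// `unused` is a lint group. The expectation is fulfilled because `dead_code`,
// one of its members, fires on this function, but the attribute would equally
// accept `unused_variables`, `unused_mut`, and the other members of the group.
#[expect(unused)]
fn never_called() -> u32 {
    0
}

// An item that is actually referenced needs no attribute at all.
fn used() -> u32 {
    1
}
```

Once an item gains a caller, either form turns into an
`unfulfilled_lint_expectations` warning, which is the cue to drop the
attribute.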

But you're right that this change belonged in the introducing commit,
not here. Fixed in v10: the introducing commit uses
#[expect(dead_code)] from the start, and this patch only changes
fmc_image to pub(crate).

> 
> Btw, this is the kind of stuff you'd normally catch by looking at the
> diff before sending it. I shouldn't have to do this.

I actually spend lots of time fooling around with dead_code and
unused, trying to get it just right. There are a lot of patches that
add and remove these.

In this case, once the dust had settled from applying the v8 review
feedback, I did notice the unused-->dead_code diff that we are
discussing, but I let it go: it's only a couple of lines that get
vaporized by the end of the series anyway, and I was anxious to get
all the other things (especially Gary/Sashiko's big discovery about
the two Blackwell arches) posted.


thanks,
-- 
John Hubbard