[PATCH v3 25/30] gpu: nova-core: Hopper/Blackwell: larger WPR2 (GSP) heap

John Hubbard posted 30 patches 2 days, 23 hours ago
Posted by John Hubbard 2 days, 23 hours ago
Hopper, Blackwell and later GPUs require a larger heap for WPR2.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/gsp/fw.rs | 55 ++++++++++++++++++++++++++-------
 1 file changed, 44 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs
index 086153edfa86..0d1d9eecd77c 100644
--- a/drivers/gpu/nova-core/gsp/fw.rs
+++ b/drivers/gpu/nova-core/gsp/fw.rs
@@ -49,21 +49,41 @@ enum GspFwHeapParams {}
 /// Minimum required alignment for the GSP heap.
 const GSP_HEAP_ALIGNMENT: Alignment = Alignment::new::<{ 1 << 20 }>();
 
+// These constants override the generated bindings for architecture-specific heap sizing.
+//
+// 14MB for Hopper/Blackwell+.
+const GSP_FW_HEAP_PARAM_BASE_RM_SIZE_GH100: u64 = 14 * SZ_1M as u64;
+// 142MB client alloc for ~188MB total.
+const GSP_FW_HEAP_PARAM_CLIENT_ALLOC_SIZE_GH100: u64 = 142 * SZ_1M as u64;
+// Blackwell-specific minimum heap size (88 + 12 + 70 = 170MB).
+const GSP_FW_HEAP_SIZE_OVERRIDE_LIBOS3_BAREMETAL_MIN_MB_BLACKWELL: u64 = 170;
+
 impl GspFwHeapParams {
     /// Returns the amount of GSP-RM heap memory used during GSP-RM boot and initialization (up to
     /// and including the first client subdevice allocation).
-    fn base_rm_size(_chipset: Chipset) -> u64 {
-        // TODO: this needs to be updated to return the correct value for Hopper+ once support for
-        // them is added:
-        // u64::from(bindings::GSP_FW_HEAP_PARAM_BASE_RM_SIZE_GH100)
-        u64::from(bindings::GSP_FW_HEAP_PARAM_BASE_RM_SIZE_TU10X)
+    fn base_rm_size(chipset: Chipset) -> u64 {
+        if crate::fb::hal::fb_hal(chipset)
+            .non_wpr_heap_size()
+            .is_some()
+        {
+            GSP_FW_HEAP_PARAM_BASE_RM_SIZE_GH100
+        } else {
+            u64::from(bindings::GSP_FW_HEAP_PARAM_BASE_RM_SIZE_TU10X)
+        }
     }
 
     /// Returns the amount of heap memory required to support a single channel allocation.
-    fn client_alloc_size() -> u64 {
-        u64::from(bindings::GSP_FW_HEAP_PARAM_CLIENT_ALLOC_SIZE)
-            .align_up(GSP_HEAP_ALIGNMENT)
-            .unwrap_or(u64::MAX)
+    fn client_alloc_size(chipset: Chipset) -> u64 {
+        if crate::fb::hal::fb_hal(chipset)
+            .non_wpr_heap_size()
+            .is_some()
+        {
+            GSP_FW_HEAP_PARAM_CLIENT_ALLOC_SIZE_GH100
+        } else {
+            u64::from(bindings::GSP_FW_HEAP_PARAM_CLIENT_ALLOC_SIZE)
+        }
+        .align_up(GSP_HEAP_ALIGNMENT)
+        .unwrap_or(u64::MAX)
     }
 
     /// Returns the amount of memory to reserve for management purposes for a framebuffer of size
@@ -106,12 +126,25 @@ impl LibosParams {
                 * num::usize_as_u64(SZ_1M),
     };
 
+    /// Hopper/Blackwell+ GPUs need a larger minimum heap size than the bindings specify.
+    /// The r570 bindings set LIBOS3_BAREMETAL_MIN_MB to 88MB, but Hopper/Blackwell+ actually
+    /// requires 170MB (88 + 12 + 70).
+    const LIBOS_BLACKWELL: LibosParams = LibosParams {
+        carveout_size: num::u32_as_u64(bindings::GSP_FW_HEAP_PARAM_OS_SIZE_LIBOS3_BAREMETAL),
+        allowed_heap_size: GSP_FW_HEAP_SIZE_OVERRIDE_LIBOS3_BAREMETAL_MIN_MB_BLACKWELL
+            * num::usize_as_u64(SZ_1M)
+            ..num::u32_as_u64(bindings::GSP_FW_HEAP_SIZE_OVERRIDE_LIBOS3_BAREMETAL_MAX_MB)
+                * num::usize_as_u64(SZ_1M),
+    };
+
     /// Returns the libos parameters corresponding to `chipset`.
     pub(crate) fn from_chipset(chipset: Chipset) -> &'static LibosParams {
         if chipset < Chipset::GA102 {
             &Self::LIBOS2
-        } else {
+        } else if chipset < Chipset::GH100 {
             &Self::LIBOS3
+        } else {
+            &Self::LIBOS_BLACKWELL
         }
     }
 
@@ -124,7 +157,7 @@ pub(crate) fn wpr_heap_size(&self, chipset: Chipset, fb_size: u64) -> u64 {
             // RM boot working memory,
             .saturating_add(GspFwHeapParams::base_rm_size(chipset))
             // One RM client,
-            .saturating_add(GspFwHeapParams::client_alloc_size())
+            .saturating_add(GspFwHeapParams::client_alloc_size(chipset))
             // Overhead for memory management.
             .saturating_add(GspFwHeapParams::management_overhead(fb_size))
             // Clamp to the supported heap sizes.
-- 
2.53.0
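[Editorial sketch for review context: the computation touched by the last hunk can be modeled as a standalone snippet. This is a simplified stand-in, not the driver code; the helper name, the 10 MB overhead, and the 256 MB ceiling are illustrative placeholders, while the 14 MB / 142 MB / 170 MB figures come from the constants the patch adds.]

```rust
// Simplified model (not the driver code) of how wpr_heap_size() composes
// the WPR2 heap: sum the components with saturating adds, then clamp the
// result into the libos-allowed range. The helper name and the overhead
// and ceiling values below are illustrative placeholders.
const MB: u64 = 1 << 20;

fn wpr_heap_size_sketch(
    base_rm: u64,                   // GSP_FW_HEAP_PARAM_BASE_RM_SIZE_* for the chipset
    client_alloc: u64,              // one RM client allocation
    overhead: u64,                  // memory-management overhead (scales with fb_size)
    allowed: core::ops::Range<u64>, // LibosParams::allowed_heap_size
) -> u64 {
    base_rm
        .saturating_add(client_alloc)
        .saturating_add(overhead)
        .clamp(allowed.start, allowed.end)
}

fn main() {
    // Hopper/Blackwell figures from the patch: 14 MB base RM + 142 MB
    // client alloc; the 10 MB overhead and 256 MB ceiling are made up.
    let size = wpr_heap_size_sketch(14 * MB, 142 * MB, 10 * MB, 170 * MB..256 * MB);
    // The 166 MB raw sum is raised to the 170 MB Blackwell minimum.
    assert_eq!(size, 170 * MB);
}
```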
Re: [PATCH v3 25/30] gpu: nova-core: Hopper/Blackwell: larger WPR2 (GSP) heap
Posted by Timur Tabi 2 days, 8 hours ago
On Thu, 2026-02-05 at 20:21 -0800, John Hubbard wrote:
> +    fn client_alloc_size(chipset: Chipset) -> u64 {
> +        if crate::fb::hal::fb_hal(chipset)
> +            .non_wpr_heap_size()
> +            .is_some()
> +        {
> +            GSP_FW_HEAP_PARAM_CLIENT_ALLOC_SIZE_GH100
> +        } else {
> +            u64::from(bindings::GSP_FW_HEAP_PARAM_CLIENT_ALLOC_SIZE)
> +        }
> +        .align_up(GSP_HEAP_ALIGNMENT)
> +        .unwrap_or(u64::MAX)

I understand not wanting to return a Result<> for impossible situations, but I don't know if this is
the answer.  For one thing, there is no way we can allocate a heap that large.  

Maybe we need to have some kind of infallible align_up() variant, but in this case, I would say a
panic is warranted.
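[Editorial sketch to make the overflow case concrete: `align_up_checked` below is a hypothetical stand-in, not the kernel's `Alignment` API. Aligning a `u64` up can only overflow when the value sits within `align - 1` of `u64::MAX`, which no realistic heap size approaches.]

```rust
// Hypothetical stand-in for the align_up() under discussion, showing why
// the Option is effectively infallible for realistic heap sizes: the
// checked add only fails for values within `align - 1` of u64::MAX.
fn align_up_checked(value: u64, align: u64) -> Option<u64> {
    assert!(align.is_power_of_two());
    value.checked_add(align - 1).map(|v| v & !(align - 1))
}

fn main() {
    const MB: u64 = 1 << 20; // matches the 1 MiB GSP_HEAP_ALIGNMENT
    // 142 MB is already 1 MiB-aligned; an odd size rounds up.
    assert_eq!(align_up_checked(142 * MB, MB), Some(142 * MB));
    assert_eq!(align_up_checked(142 * MB + 1, MB), Some(143 * MB));
    // The only failing inputs sit at the very top of the u64 range --
    // the "impossible situation" where a panic via expect() is defensible.
    assert_eq!(align_up_checked(u64::MAX - 10, MB), None);
}
```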
Re: [PATCH v3 25/30] gpu: nova-core: Hopper/Blackwell: larger WPR2 (GSP) heap
Posted by John Hubbard an hour ago
On 2/6/26 11:03 AM, Timur Tabi wrote:
> On Thu, 2026-02-05 at 20:21 -0800, John Hubbard wrote:
>> +    fn client_alloc_size(chipset: Chipset) -> u64 {
>> +        if crate::fb::hal::fb_hal(chipset)
>> +            .non_wpr_heap_size()
>> +            .is_some()
>> +        {
>> +            GSP_FW_HEAP_PARAM_CLIENT_ALLOC_SIZE_GH100
>> +        } else {
>> +            u64::from(bindings::GSP_FW_HEAP_PARAM_CLIENT_ALLOC_SIZE)
>> +        }
>> +        .align_up(GSP_HEAP_ALIGNMENT)
>> +        .unwrap_or(u64::MAX)
> 
> I understand not wanting to return a Result<> for impossible situations, but I don't know if this is
> the answer.  For one thing, there is no way we can allocate a heap that large.
> 
> Maybe we need to have some kind of infallible align_up() variant, but in this case, I would say a
> panic is warranted.

You are right. And that agrees with Miguel Ojeda's overall philosophy
about handling this type of error, too.

I'll change it to this for the next version:

diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs
index 0d1d9eecd77c..927bcee6a5a5 100644
--- a/drivers/gpu/nova-core/gsp/fw.rs
+++ b/drivers/gpu/nova-core/gsp/fw.rs
@@ -83,7 +83,7 @@ fn client_alloc_size(chipset: Chipset) -> u64 {
              u64::from(bindings::GSP_FW_HEAP_PARAM_CLIENT_ALLOC_SIZE)
          }
          .align_up(GSP_HEAP_ALIGNMENT)
-        .unwrap_or(u64::MAX)
+        .expect("client_alloc_size alignment overflow")
      }
  
      /// Returns the amount of memory to reserve for management purposes for a framebuffer of size
@@ -94,7 +94,7 @@ fn management_overhead(fb_size: u64) -> u64 {
          u64::from(bindings::GSP_FW_HEAP_PARAM_SIZE_PER_GB_FB)
              .saturating_mul(fb_size_gb)
              .align_up(GSP_HEAP_ALIGNMENT)
-            .unwrap_or(u64::MAX)
+            .expect("management_overhead alignment overflow")
      }
  }


thanks,
-- 
John Hubbard