[RFC 5/7] gpu: nova-core: set RMSetSriovMode when NVIDIA vGPU is enabled

Zhi Wang posted 7 patches 1 week, 6 days ago
Posted by Zhi Wang 1 week, 6 days ago
The registry object "RMSetSriovMode" is required to be set when vGPU is
enabled.

Set "RMSetSriovMode" to 1 when nova-core loads the GSP firmware and
initializes the GSP registry objects, if vGPU is enabled.

Signed-off-by: Zhi Wang <zhiw@nvidia.com>
---
 drivers/gpu/nova-core/gsp/boot.rs     |  3 ++-
 drivers/gpu/nova-core/gsp/commands.rs | 23 +++++++++++++++--------
 2 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/nova-core/gsp/boot.rs b/drivers/gpu/nova-core/gsp/boot.rs
index 5016c630cec3..847ce550eccf 100644
--- a/drivers/gpu/nova-core/gsp/boot.rs
+++ b/drivers/gpu/nova-core/gsp/boot.rs
@@ -168,7 +168,8 @@ pub(crate) fn boot(
 
         self.cmdq
             .send_command(bar, commands::SetSystemInfo::new(pdev, vf_info))?;
-        self.cmdq.send_command(bar, commands::SetRegistry::new())?;
+        self.cmdq
+            .send_command(bar, commands::SetRegistry::new(vgpu_support))?;
 
         gsp_falcon.reset(bar)?;
         let libos_handle = self.libos.dma_handle();
diff --git a/drivers/gpu/nova-core/gsp/commands.rs b/drivers/gpu/nova-core/gsp/commands.rs
index 1d519c4ed232..00ba48a25444 100644
--- a/drivers/gpu/nova-core/gsp/commands.rs
+++ b/drivers/gpu/nova-core/gsp/commands.rs
@@ -64,16 +64,18 @@ struct RegistryEntry {
 
 /// The `SetRegistry` command.
 pub(crate) struct SetRegistry {
-    entries: [RegistryEntry; Self::NUM_ENTRIES],
+    entries: [RegistryEntry; Self::MAX_NUM_ENTRIES],
+    num_entries: usize,
 }
 
 impl SetRegistry {
     // For now we hard-code the registry entries. Future work will allow others to
     // be added as module parameters.
-    const NUM_ENTRIES: usize = 3;
+    const MAX_NUM_ENTRIES: usize = 4;
 
     /// Creates a new `SetRegistry` command, using a set of hardcoded entries.
-    pub(crate) fn new() -> Self {
+    pub(crate) fn new(vgpu_support: bool) -> Self {
+        let num_entries = if vgpu_support { 4 } else { 3 };
         Self {
             entries: [
                 // RMSecBusResetEnable - enables PCI secondary bus reset
@@ -93,7 +95,12 @@ pub(crate) fn new() -> Self {
                     key: "RMDevidCheckIgnore",
                     value: 1,
                 },
+                RegistryEntry {
+                    key: "RMSetSriovMode",
+                    value: 1,
+                },
             ],
+            num_entries,
         }
     }
 }
@@ -104,15 +111,15 @@ impl CommandToGsp for SetRegistry {
     type InitError = Infallible;
 
     fn init(&self) -> impl Init<Self::Command, Self::InitError> {
-        PackedRegistryTable::init(Self::NUM_ENTRIES as u32, self.variable_payload_len() as u32)
+        PackedRegistryTable::init(self.num_entries as u32, self.variable_payload_len() as u32)
     }
 
     fn variable_payload_len(&self) -> usize {
         let mut key_size = 0;
-        for i in 0..Self::NUM_ENTRIES {
+        for i in 0..self.num_entries {
             key_size += self.entries[i].key.len() + 1; // +1 for NULL terminator
         }
-        Self::NUM_ENTRIES * size_of::<PackedRegistryEntry>() + key_size
+        self.num_entries * size_of::<PackedRegistryEntry>() + key_size
     }
 
     fn init_variable_payload(
@@ -120,12 +127,12 @@ fn init_variable_payload(
         dst: &mut SBufferIter<core::array::IntoIter<&mut [u8], 2>>,
     ) -> Result {
         let string_data_start_offset =
-            size_of::<PackedRegistryTable>() + Self::NUM_ENTRIES * size_of::<PackedRegistryEntry>();
+            size_of::<PackedRegistryTable>() + self.num_entries * size_of::<PackedRegistryEntry>();
 
         // Array for string data.
         let mut string_data = KVec::new();
 
-        for entry in self.entries.iter().take(Self::NUM_ENTRIES) {
+        for entry in self.entries.iter().take(self.num_entries) {
             dst.write_all(
                 PackedRegistryEntry::new(
                     (string_data_start_offset + string_data.len()) as u32,
-- 
2.51.0
Re: [RFC 5/7] gpu: nova-core: set RMSetSriovMode when NVIDIA vGPU is enabled
Posted by Timur Tabi 1 week, 4 days ago
On Sat, 2025-12-06 at 12:42 +0000, Zhi Wang wrote:
> -    pub(crate) fn new() -> Self {
> +    pub(crate) fn new(vgpu_support: bool) -> Self {
> +        let num_entries = if vgpu_support { 4 } else { 3 };

Instead of passing a bool, and then hard-coding the length based on that bool (which would
require that RMSetSriovMode always be the last entry in the array), you need to do what Nouveau
does: if VGPU is enabled, then dynamically append the entry to the array.
Re: [RFC 5/7] gpu: nova-core: set RMSetSriovMode when NVIDIA vGPU is enabled
Posted by Alexandre Courbot 4 days, 9 hours ago
On Mon Dec 8, 2025 at 12:55 AM JST, Timur Tabi wrote:
> On Sat, 2025-12-06 at 12:42 +0000, Zhi Wang wrote:
>> -    pub(crate) fn new() -> Self {
>> +    pub(crate) fn new(vgpu_support: bool) -> Self {
>> +        let num_entries = if vgpu_support { 4 } else { 3 };
>
> Instead of passing a bool, and then hard-coding the length based on that bool (which would
> require that RMSetSriovMode always be the last entry in the array), you need to do what Nouveau
> does: if VGPU is enabled, then dynamically append the entry to the array.

Yup, as we will add more stuff to the registry depending on various
conditions it would be great to make it more flexible.

The way to do this would be to make `SetRegistry` wrap a `KVec` of
`RegistryEntry` - that way we can add what we need according to the
runtime options.

The current design of `SetRegistry` (which does not directly wrap the
type to write to the command queue, but instead implements
`CommandToGsp` to create it as the command is sent) should make this
rather trivial.
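
A minimal userspace sketch of that idea, under stated assumptions: `Vec`
stands in for the kernel's `KVec` (whose pushes are fallible and take
allocation flags), `PackedRegistryEntry` is a hypothetical four-word layout
rather than the real firmware binding, and only the two keys visible in the
diff are listed as unconditional entries (the first entry's value is
assumed). The point is only that the entry count and payload length follow
from the list itself, so no entry needs a fixed position.

```rust
use std::mem::size_of;

/// One registry key/value pair, with the two fields visible in the patch.
#[allow(dead_code)]
struct RegistryEntry {
    key: &'static str,
    value: u32,
}

/// Hypothetical stand-in for the packed wire-format entry. The real layout
/// comes from the GSP firmware bindings and may differ.
#[allow(dead_code)]
#[repr(C)]
struct PackedRegistryEntry {
    name_offset: u32,
    kind: u32,
    data: u32,
    length: u32,
}

/// `SetRegistry` holding a growable list instead of a fixed-size array.
struct SetRegistry {
    entries: Vec<RegistryEntry>,
}

impl SetRegistry {
    fn new(vgpu_support: bool) -> Self {
        // Unconditional entries (only the keys visible in the diff; the
        // value of the first one is an assumption).
        let mut entries = vec![
            RegistryEntry { key: "RMSecBusResetEnable", value: 1 },
            RegistryEntry { key: "RMDevidCheckIgnore", value: 1 },
        ];
        // Conditional entry, appended only when vGPU is enabled.
        if vgpu_support {
            entries.push(RegistryEntry { key: "RMSetSriovMode", value: 1 });
        }
        Self { entries }
    }

    /// Number of entries, derived from the list rather than hard-coded.
    fn num_entries(&self) -> usize {
        self.entries.len()
    }

    /// Packed entries plus NUL-terminated key strings.
    fn variable_payload_len(&self) -> usize {
        let key_size: usize = self.entries.iter().map(|e| e.key.len() + 1).sum();
        self.entries.len() * size_of::<PackedRegistryEntry>() + key_size
    }
}
```

With this shape, `SetRegistry::new(true)` simply has one more entry than
`SetRegistry::new(false)`, and further conditional entries can be pushed in
any order.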
Re: [RFC 5/7] gpu: nova-core: set RMSetSriovMode when NVIDIA vGPU is enabled
Posted by Joel Fernandes 1 week, 4 days ago

> On Dec 7, 2025, at 10:55 AM, Timur Tabi <ttabi@nvidia.com> wrote:
> 
>> On Sat, 2025-12-06 at 12:42 +0000, Zhi Wang wrote:
>> -    pub(crate) fn new() -> Self {
>> +    pub(crate) fn new(vgpu_support: bool) -> Self {
>> +        let num_entries = if vgpu_support { 4 } else { 3 };
> 
> Instead of passing a bool, and then hard-coding the length based on that bool (which would
> require that RMSetSriovMode always be the last entry in the array), you need to do what Nouveau
> does: if VGPU is enabled, then dynamically append the entry to the array.

Yeah, I agree with Timur. 

Thanks.


Re: [RFC 5/7] gpu: nova-core: set RMSetSriovMode when NVIDIA vGPU is enabled
Posted by Zhi Wang 1 week, 2 days ago
On Sun, 7 Dec 2025 16:57:01 +0000
Joel Fernandes <joelagnelf@nvidia.com> wrote:

> 
> 
> > On Dec 7, 2025, at 10:55 AM, Timur Tabi <ttabi@nvidia.com> wrote:
> > 
> >> On Sat, 2025-12-06 at 12:42 +0000, Zhi Wang wrote:
> >> -    pub(crate) fn new() -> Self {
> >> +    pub(crate) fn new(vgpu_support: bool) -> Self {
> >> +        let num_entries = if vgpu_support { 4 } else { 3 };
> > 
> > Instead of passing a bool, and then hard-coding the length based on
> > that bool (which would require that RMSetSriovMode always be the
> > last entry in the array), you need to do what Nouveau does: if VGPU
> > is enabled, then dynamically append the entry to the array.
> 
> Yeah, I agree with Timur. 
> 

Hey Timur and Joel:

Let me see how this could be solved dynamically. It will probably need
changes to other items as well.

Apart from this, while writing these patches I felt that we might need
a struct GspBootConfig to pass along the GSP boot path. We already have
several items coming: the reserved memory size, which depends on
whether vGPU is enabled; the vGPU enable switch; and the reserved
memory size on Hopper/Blackwell in John's patch.

It seems we need a central object to hold these tunables for GSP boot.

Z.

> Thanks.
> 
>
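
A rough sketch of what such a `GspBootConfig` could look like, purely as an
illustration: the field names and the `extra_registry_entries` helper are
hypothetical and appear nowhere in the posted series. It only shows one
object carrying the vGPU switch and a reserved memory size, with the
conditional registry key derived from it.

```rust
/// Hypothetical boot-time configuration; field names are illustrative only.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct GspBootConfig {
    /// Whether vGPU support is enabled for this boot.
    vgpu_support: bool,
    /// Reserved memory size in bytes (how it is chosen is out of scope here).
    reserved_mem_size: u64,
}

impl GspBootConfig {
    /// Registry key/value pairs that depend on the configuration.
    fn extra_registry_entries(&self) -> Vec<(&'static str, u32)> {
        let mut extra = Vec::new();
        if self.vgpu_support {
            // Key and value as set in this patch.
            extra.push(("RMSetSriovMode", 1));
        }
        extra
    }
}
```

The boot path would then take a single `&GspBootConfig` instead of a growing
list of booleans and sizes.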