gpu: nova-core: fixups for GSP message queue and bindings

[PATCH v3 2/4] gpu: nova-core: gsp: fix length of received messages

Posted by Alexandre Courbot 1 month, 3 weeks ago

The size of messages' payload is miscalculated, leading to extra data
passed to the message handler. While this is not a problem with our
current set of commands, others with a variable-length payload may
misbehave. Fix this by introducing a method returning the payload size
and using it.

Fixes: 75f6b1de8133 ("gpu: nova-core: gsp: Add GSP command queue bindings and handling")
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
 drivers/gpu/nova-core/gsp/cmdq.rs | 10 ++++++----
 drivers/gpu/nova-core/gsp/fw.rs   | 13 +++++++++----
 2 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
index 6f946d14868a..7985a0b3f769 100644
--- a/drivers/gpu/nova-core/gsp/cmdq.rs
+++ b/drivers/gpu/nova-core/gsp/cmdq.rs
@@ -588,21 +588,23 @@ fn wait_for_msg(&self, timeout: Delta) -> Result<GspMessage<'_>> {
             header.length(),
         );
 
+        let payload_length = header.payload_length();
+
         // Check that the driver read area is large enough for the message.
-        if slice_1.len() + slice_2.len() < header.length() {
+        if slice_1.len() + slice_2.len() < payload_length {
             return Err(EIO);
         }
 
         // Cut the message slices down to the actual length of the message.
-        let (slice_1, slice_2) = if slice_1.len() > header.length() {
+        let (slice_1, slice_2) = if slice_1.len() > payload_length {
             // PANIC: we checked above that `slice_1` is at least as long as `msg_header.length()`.
-            (slice_1.split_at(header.length()).0, &slice_2[0..0])
+            (slice_1.split_at(payload_length).0, &slice_2[0..0])
         } else {
             (
                 slice_1,
                 // PANIC: we checked above that `slice_1.len() + slice_2.len()` is at least as
                 // large as `msg_header.length()`.
-                slice_2.split_at(header.length() - slice_1.len()).0,
+                slice_2.split_at(payload_length - slice_1.len()).0,
             )
         };
 
diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs
index abffd6beec65..7b8e710b33e7 100644
--- a/drivers/gpu/nova-core/gsp/fw.rs
+++ b/drivers/gpu/nova-core/gsp/fw.rs
@@ -853,11 +853,16 @@ pub(crate) fn set_checksum(&mut self, checksum: u32) {
         self.inner.checkSum = checksum;
     }
 
-    /// Returns the total length of the message.
+    /// Returns the length of the message's payload.
+    pub(crate) fn payload_length(&self) -> usize {
+        // `rpc.length` includes the length of the RPC message header.
+        num::u32_as_usize(self.inner.rpc.length)
+            .saturating_sub(size_of::<bindings::rpc_message_header_v>())
+    }
+
+    /// Returns the total length of the message, message and RPC headers included.
     pub(crate) fn length(&self) -> usize {
-        // `rpc.length` includes the length of the GspRpcHeader but not the message header.
-        size_of::<Self>() - size_of::<bindings::rpc_message_header_v>()
-            + num::u32_as_usize(self.inner.rpc.length)
+        size_of::<Self>() + self.payload_length()
     }
 
     // Returns the sequence number of the message.

-- 
2.52.0

Re: [PATCH v3 2/4] gpu: nova-core: gsp: fix length of received messages

Posted by Joel Fernandes 1 month, 3 weeks ago

Hi Alex,

> On Dec 15, 2025, at 9:57 PM, Alexandre Courbot <acourbot@nvidia.com> wrote:
> 
> The size of messages' payload is miscalculated, leading to extra data
> passed to the message handler. While this is not a problem with our
> current set of commands, others with a variable-length payload may
> misbehave. Fix this by introducing a method returning the payload size
> and using it.
> 
> Fixes: 75f6b1de8133 ("gpu: nova-core: gsp: Add GSP command queue bindings and handling")
> Reviewed-by: Lyude Paul <lyude@redhat.com>
> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
> ---
> drivers/gpu/nova-core/gsp/cmdq.rs | 10 ++++++----
> drivers/gpu/nova-core/gsp/fw.rs   | 13 +++++++++----
> 2 files changed, 15 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
> index 6f946d14868a..7985a0b3f769 100644
> --- a/drivers/gpu/nova-core/gsp/cmdq.rs
> +++ b/drivers/gpu/nova-core/gsp/cmdq.rs
> @@ -588,21 +588,23 @@ fn wait_for_msg(&self, timeout: Delta) -> Result<GspMessage<'_>> {
>             header.length(),
>         );
> 
> +        let payload_length = header.payload_length();
> +
>         // Check that the driver read area is large enough for the message.
> -        if slice_1.len() + slice_2.len() < header.length() {
> +        if slice_1.len() + slice_2.len() < payload_length {
>             return Err(EIO);
>         }
> 
>         // Cut the message slices down to the actual length of the message.
> -        let (slice_1, slice_2) = if slice_1.len() > header.length() {
> +        let (slice_1, slice_2) = if slice_1.len() > payload_length {
>             // PANIC: we checked above that `slice_1` is at least as long as `msg_header.length()`.
> -            (slice_1.split_at(header.length()).0, &slice_2[0..0])
> +            (slice_1.split_at(payload_length).0, &slice_2[0..0])
>         } else {
>             (
>                 slice_1,
>                 // PANIC: we checked above that `slice_1.len() + slice_2.len()` is at least as
>                 // large as `msg_header.length()`.
> -                slice_2.split_at(header.length() - slice_1.len()).0,
> +                slice_2.split_at(payload_length - slice_1.len()).0,

The panic comments also need updating?

Other than the nit,

Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com>

thanks,

 - Joel



>             )
>         };
> 
> diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs
> index abffd6beec65..7b8e710b33e7 100644
> --- a/drivers/gpu/nova-core/gsp/fw.rs
> +++ b/drivers/gpu/nova-core/gsp/fw.rs
> @@ -853,11 +853,16 @@ pub(crate) fn set_checksum(&mut self, checksum: u32) {
>         self.inner.checkSum = checksum;
>     }
> 
> -    /// Returns the total length of the message.
> +    /// Returns the length of the message's payload.
> +    pub(crate) fn payload_length(&self) -> usize {
> +        // `rpc.length` includes the length of the RPC message header.
> +        num::u32_as_usize(self.inner.rpc.length)
> +            .saturating_sub(size_of::<bindings::rpc_message_header_v>())
> +    }
> +
> +    /// Returns the total length of the message, message and RPC headers included.
>     pub(crate) fn length(&self) -> usize {
> -        // `rpc.length` includes the length of the GspRpcHeader but not the message header.
> -        size_of::<Self>() - size_of::<bindings::rpc_message_header_v>()
> -            + num::u32_as_usize(self.inner.rpc.length)
> +        size_of::<Self>() + self.payload_length()
>     }
> 
>     // Returns the sequence number of the message.
> 
> --
> 2.52.0
>

Re: [PATCH v3 2/4] gpu: nova-core: gsp: fix length of received messages

Posted by Alexandre Courbot 1 month, 3 weeks ago

On Tue Dec 16, 2025 at 6:21 PM JST, Joel Fernandes wrote:
> Hi Alex,
>
>> On Dec 15, 2025, at 9:57 PM, Alexandre Courbot <acourbot@nvidia.com> wrote:
>> 
>> The size of messages' payload is miscalculated, leading to extra data
>> passed to the message handler. While this is not a problem with our
>> current set of commands, others with a variable-length payload may
>> misbehave. Fix this by introducing a method returning the payload size
>> and using it.
>> 
>> Fixes: 75f6b1de8133 ("gpu: nova-core: gsp: Add GSP command queue bindings and handling")
>> Reviewed-by: Lyude Paul <lyude@redhat.com>
>> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
>> ---
>> drivers/gpu/nova-core/gsp/cmdq.rs | 10 ++++++----
>> drivers/gpu/nova-core/gsp/fw.rs   | 13 +++++++++----
>> 2 files changed, 15 insertions(+), 8 deletions(-)
>> 
>> diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
>> index 6f946d14868a..7985a0b3f769 100644
>> --- a/drivers/gpu/nova-core/gsp/cmdq.rs
>> +++ b/drivers/gpu/nova-core/gsp/cmdq.rs
>> @@ -588,21 +588,23 @@ fn wait_for_msg(&self, timeout: Delta) -> Result<GspMessage<'_>> {
>>             header.length(),
>>         );
>> 
>> +        let payload_length = header.payload_length();
>> +
>>         // Check that the driver read area is large enough for the message.
>> -        if slice_1.len() + slice_2.len() < header.length() {
>> +        if slice_1.len() + slice_2.len() < payload_length {
>>             return Err(EIO);
>>         }
>> 
>>         // Cut the message slices down to the actual length of the message.
>> -        let (slice_1, slice_2) = if slice_1.len() > header.length() {
>> +        let (slice_1, slice_2) = if slice_1.len() > payload_length {
>>             // PANIC: we checked above that `slice_1` is at least as long as `msg_header.length()`.
>> -            (slice_1.split_at(header.length()).0, &slice_2[0..0])
>> +            (slice_1.split_at(payload_length).0, &slice_2[0..0])
>>         } else {
>>             (
>>                 slice_1,
>>                 // PANIC: we checked above that `slice_1.len() + slice_2.len()` is at least as
>>                 // large as `msg_header.length()`.
>> -                slice_2.split_at(header.length() - slice_1.len()).0,
>> +                slice_2.split_at(payload_length - slice_1.len()).0,
>
> The panic comments also need updating?

Oh, they do - nice catch, thanks!

Re: [PATCH v3 2/4] gpu: nova-core: gsp: fix length of received messages

Posted by Alistair Popple 1 month, 3 weeks ago

On 2025-12-16 at 13:57 +1100, Alexandre Courbot <acourbot@nvidia.com> wrote...
> The size of messages' payload is miscalculated, leading to extra data
> passed to the message handler. While this is not a problem with our
> current set of commands, others with a variable-length payload may
> misbehave. Fix this by introducing a method returning the payload size
> and using it.

The whole inconsistency of the message element struct not including it's header
fields in the size whilst the rpc struct does has caused endless confusion, this
looks much better, thanks for fixing!

Reviewed-by: Alistair Popple <apopple@nvidia.com>

> Fixes: 75f6b1de8133 ("gpu: nova-core: gsp: Add GSP command queue bindings and handling")
> Reviewed-by: Lyude Paul <lyude@redhat.com>
> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
> ---
>  drivers/gpu/nova-core/gsp/cmdq.rs | 10 ++++++----
>  drivers/gpu/nova-core/gsp/fw.rs   | 13 +++++++++----
>  2 files changed, 15 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
> index 6f946d14868a..7985a0b3f769 100644
> --- a/drivers/gpu/nova-core/gsp/cmdq.rs
> +++ b/drivers/gpu/nova-core/gsp/cmdq.rs
> @@ -588,21 +588,23 @@ fn wait_for_msg(&self, timeout: Delta) -> Result<GspMessage<'_>> {
>              header.length(),
>          );
>  
> +        let payload_length = header.payload_length();
> +
>          // Check that the driver read area is large enough for the message.
> -        if slice_1.len() + slice_2.len() < header.length() {
> +        if slice_1.len() + slice_2.len() < payload_length {
>              return Err(EIO);
>          }
>  
>          // Cut the message slices down to the actual length of the message.
> -        let (slice_1, slice_2) = if slice_1.len() > header.length() {
> +        let (slice_1, slice_2) = if slice_1.len() > payload_length {
>              // PANIC: we checked above that `slice_1` is at least as long as `msg_header.length()`.
> -            (slice_1.split_at(header.length()).0, &slice_2[0..0])
> +            (slice_1.split_at(payload_length).0, &slice_2[0..0])
>          } else {
>              (
>                  slice_1,
>                  // PANIC: we checked above that `slice_1.len() + slice_2.len()` is at least as
>                  // large as `msg_header.length()`.
> -                slice_2.split_at(header.length() - slice_1.len()).0,
> +                slice_2.split_at(payload_length - slice_1.len()).0,
>              )
>          };
>  
> diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs
> index abffd6beec65..7b8e710b33e7 100644
> --- a/drivers/gpu/nova-core/gsp/fw.rs
> +++ b/drivers/gpu/nova-core/gsp/fw.rs
> @@ -853,11 +853,16 @@ pub(crate) fn set_checksum(&mut self, checksum: u32) {
>          self.inner.checkSum = checksum;
>      }
>  
> -    /// Returns the total length of the message.
> +    /// Returns the length of the message's payload.
> +    pub(crate) fn payload_length(&self) -> usize {
> +        // `rpc.length` includes the length of the RPC message header.
> +        num::u32_as_usize(self.inner.rpc.length)
> +            .saturating_sub(size_of::<bindings::rpc_message_header_v>())
> +    }
> +
> +    /// Returns the total length of the message, message and RPC headers included.
>      pub(crate) fn length(&self) -> usize {
> -        // `rpc.length` includes the length of the GspRpcHeader but not the message header.
> -        size_of::<Self>() - size_of::<bindings::rpc_message_header_v>()
> -            + num::u32_as_usize(self.inner.rpc.length)
> +        size_of::<Self>() + self.payload_length()
>      }
>  
>      // Returns the sequence number of the message.
> 
> -- 
> 2.52.0
>

Re: [PATCH v3 2/4] gpu: nova-core: gsp: fix length of received messages

Posted by Alexandre Courbot 1 month, 3 weeks ago

On Tue Dec 16, 2025 at 3:14 PM JST, Alistair Popple wrote:
> On 2025-12-16 at 13:57 +1100, Alexandre Courbot <acourbot@nvidia.com> wrote...
>> The size of messages' payload is miscalculated, leading to extra data
>> passed to the message handler. While this is not a problem with our
>> current set of commands, others with a variable-length payload may
>> misbehave. Fix this by introducing a method returning the payload size
>> and using it.
>
> The whole inconsistency of the message element struct not including it's header
> fields in the size whilst the rpc struct does has caused endless confusion, this
> looks much better, thanks for fixing!

Indeed. It would be so much simpler if the RPC header just included the
size of its *payload* - because if we have the header to begin with, of
course it is part of the message!

Instead we have to deal with the possibility of a nonsensical length
value if it is shorter than that of the header. Thankfully a saturating
sub always yields a correct behavior, else we would have to return a
`Result`. :/

>
> Reviewed-by: Alistair Popple <apopple@nvidia.com>

Thanks!