The current code hands out buffers that go all the way up to and
including `rx - 1`, but we need to maintain an empty slot to prevent the
ring buffer from wrapping around into having 'tx == rx', which means
empty.
Also add more rigorous no-panic proofs.
Fixes: 75f6b1de8133 ("gpu: nova-core: gsp: Add GSP command queue bindings and handling")
Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
---
drivers/gpu/nova-core/gsp/cmdq.rs | 33 +++++++++++++++++++--------------
1 file changed, 19 insertions(+), 14 deletions(-)
diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
index 09c28eeb6f12..aa8758fc7723 100644
--- a/drivers/gpu/nova-core/gsp/cmdq.rs
+++ b/drivers/gpu/nova-core/gsp/cmdq.rs
@@ -227,21 +227,26 @@ fn new(dev: &device::Device<device::Bound>) -> Result<Self> {
// PANIC: per the invariant of `cpu_write_ptr`, `tx` is `< MSGQ_NUM_PAGES`.
let (before_tx, after_tx) = gsp_mem.cpuq.msgq.data.split_at_mut(tx);
- if rx <= tx {
- // The area from `tx` up to the end of the ring, and from the beginning of the ring up
- // to `rx`, minus one unit, belongs to the driver.
- if rx == 0 {
- let last = after_tx.len() - 1;
- (&mut after_tx[..last], &mut before_tx[0..0])
- } else {
- (after_tx, &mut before_tx[..rx])
- }
+ // The area starting at `tx` and ending at `rx - 2` modulo MSGQ_NUM_PAGES, inclusive,
+ // belongs to the driver for writing.
+ if rx == 0 {
+ // Since `rx` is zero, leave an empty slot at end of the buffer.
+ let last = after_tx.len() - 1;
+ (&mut after_tx[..last], &mut before_tx[0..0])
+ } else if rx > tx {
+ // The area is contiguous and we leave an empty slot before `rx`.
+ // PANIC:
+ // - The index `rx - tx - 1` is non-negative because `rx > tx` in this branch.
+ // - The index does not exceed `after_tx.len()` (which is `MSGQ_NUM_PAGES - tx`)
+ // because `rx < MSGQ_NUM_PAGES` by the `gsp_read_ptr` invariant.
+ (&mut after_tx[..(rx - tx - 1)], &mut before_tx[0..0])
} else {
- // The area from `tx` to `rx`, minus one unit, belongs to the driver.
- //
- // PANIC: per the invariants of `cpu_write_ptr` and `gsp_read_ptr`, `rx` and `tx` are
- // `<= MSGQ_NUM_PAGES`, and the test above ensured that `rx > tx`.
- (after_tx.split_at_mut(rx - tx).0, &mut before_tx[0..0])
+ // The area is discontiguous and we leave an empty slot before `rx`.
+ // PANIC:
+ // - The index `rx - 1` is non-negative because `rx != 0` in this branch.
+ // - The index does not exceed `before_tx.len()` (which equals `tx`) because
+ // `rx <= tx` in this branch.
+ (after_tx, &mut before_tx[..(rx - 1)])
}
}
--
2.52.0
On Fri Jan 23, 2026 at 9:12 PM JST, Eliot Courtney wrote:
> The current code hands out buffers that go all the way up to and
> including `rx - 1`, but we need to maintain an empty slot to prevent the
> ring buffer from wrapping around into having 'tx == rx', which means
> empty.
>
> Also add more rigorous no-panic proofs.
>
> Fixes: 75f6b1de8133 ("gpu: nova-core: gsp: Add GSP command queue bindings and handling")
> Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
> ---
> drivers/gpu/nova-core/gsp/cmdq.rs | 33 +++++++++++++++++++--------------
> 1 file changed, 19 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
> index 09c28eeb6f12..aa8758fc7723 100644
> --- a/drivers/gpu/nova-core/gsp/cmdq.rs
> +++ b/drivers/gpu/nova-core/gsp/cmdq.rs
> @@ -227,21 +227,26 @@ fn new(dev: &device::Device<device::Bound>) -> Result<Self> {
> // PANIC: per the invariant of `cpu_write_ptr`, `tx` is `< MSGQ_NUM_PAGES`.
> let (before_tx, after_tx) = gsp_mem.cpuq.msgq.data.split_at_mut(tx);
>
> - if rx <= tx {
> - // The area from `tx` up to the end of the ring, and from the beginning of the ring up
> - // to `rx`, minus one unit, belongs to the driver.
> - if rx == 0 {
> - let last = after_tx.len() - 1;
> - (&mut after_tx[..last], &mut before_tx[0..0])
> - } else {
> - (after_tx, &mut before_tx[..rx])
Indeed, the comment clearly states "minus one unit" but the second
branch (and the one below) ignores that. >_<
> - }
> + // The area starting at `tx` and ending at `rx - 2` modulo MSGQ_NUM_PAGES, inclusive,
> + // belongs to the driver for writing.
I have trouble parsing this comment - it says `rx - 2` but nowhere does
that expression appear in the code. It's particularly confusing because
it apparently applies to all 3 blocks, but the position suggests it is
only the first. I'd add a blank line after it to signal that it is
relevant to the rest of the method.
> + if rx == 0 {
> + // Since `rx` is zero, leave an empty slot at end of the buffer.
> + let last = after_tx.len() - 1;
> + (&mut after_tx[..last], &mut before_tx[0..0])
> + } else if rx > tx {
In the original code, the `rx < tx` case appears first. Can we preserve
this order so the corresponding diffs also appear next to one another?
> + // The area is contiguous and we leave an empty slot before `rx`.
> + // PANIC:
> + // - The index `rx - tx - 1` is non-negative because `rx > tx` in this branch.
> + // - The index does not exceed `after_tx.len()` (which is `MSGQ_NUM_PAGES - tx`)
> + // because `rx < MSGQ_NUM_PAGES` by the `gsp_read_ptr` invariant.
> + (&mut after_tx[..(rx - tx - 1)], &mut before_tx[0..0])
> } else {
> - // The area from `tx` to `rx`, minus one unit, belongs to the driver.
> - //
> - // PANIC: per the invariants of `cpu_write_ptr` and `gsp_read_ptr`, `rx` and `tx` are
> - // `<= MSGQ_NUM_PAGES`, and the test above ensured that `rx > tx`.
> - (after_tx.split_at_mut(rx - tx).0, &mut before_tx[0..0])
> + // The area is discontiguous and we leave an empty slot before `rx`.
> + // PANIC:
> + // - The index `rx - 1` is non-negative because `rx != 0` in this branch.
> + // - The index does not exceed `before_tx.len()` (which equals `tx`) because
> + // `rx <= tx` in this branch.
> + (after_tx, &mut before_tx[..(rx - 1)])
In any case, nice catch! This would have bitten us sooner or later.
On Fri Jan 23, 2026 at 12:12 PM GMT, Eliot Courtney wrote:
> The current code hands out buffers that go all the way up to and
> including `rx - 1`, but we need to maintain an empty slot to prevent the
> ring buffer from wrapping around into having 'tx == rx', which means
> empty.
>
> Also add more rigorous no-panic proofs.
>
> Fixes: 75f6b1de8133 ("gpu: nova-core: gsp: Add GSP command queue bindings and handling")
> Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
> ---
> drivers/gpu/nova-core/gsp/cmdq.rs | 33 +++++++++++++++++++--------------
> 1 file changed, 19 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
> index 09c28eeb6f12..aa8758fc7723 100644
> --- a/drivers/gpu/nova-core/gsp/cmdq.rs
> +++ b/drivers/gpu/nova-core/gsp/cmdq.rs
> @@ -227,21 +227,26 @@ fn new(dev: &device::Device<device::Bound>) -> Result<Self> {
> // PANIC: per the invariant of `cpu_write_ptr`, `tx` is `< MSGQ_NUM_PAGES`.
> let (before_tx, after_tx) = gsp_mem.cpuq.msgq.data.split_at_mut(tx);
>
> - if rx <= tx {
> - // The area from `tx` up to the end of the ring, and from the beginning of the ring up
> - // to `rx`, minus one unit, belongs to the driver.
> - if rx == 0 {
> - let last = after_tx.len() - 1;
> - (&mut after_tx[..last], &mut before_tx[0..0])
> - } else {
> - (after_tx, &mut before_tx[..rx])
> - }
> + // The area starting at `tx` and ending at `rx - 2` modulo MSGQ_NUM_PAGES, inclusive,
> + // belongs to the driver for writing.
> + if rx == 0 {
> + // Since `rx` is zero, leave an empty slot at end of the buffer.
> + let last = after_tx.len() - 1;
> + (&mut after_tx[..last], &mut before_tx[0..0])
Does the address actually matter? Otherwise I would find `&mut []` easier to
understand than an empty indexing.
> + } else if rx > tx {
> + // The area is contiguous and we leave an empty slot before `rx`.
> + // PANIC:
> + // - The index `rx - tx - 1` is non-negative because `rx > tx` in this branch.
> + // - The index does not exceed `after_tx.len()` (which is `MSGQ_NUM_PAGES - tx`)
> + // because `rx < MSGQ_NUM_PAGES` by the `gsp_read_ptr` invariant.
> + (&mut after_tx[..(rx - tx - 1)], &mut before_tx[0..0])
> } else {
> - // The area from `tx` to `rx`, minus one unit, belongs to the driver.
> - //
> - // PANIC: per the invariants of `cpu_write_ptr` and `gsp_read_ptr`, `rx` and `tx` are
> - // `<= MSGQ_NUM_PAGES`, and the test above ensured that `rx > tx`.
> - (after_tx.split_at_mut(rx - tx).0, &mut before_tx[0..0])
> + // The area is discontiguous and we leave an empty slot before `rx`.
> + // PANIC:
> + // - The index `rx - 1` is non-negative because `rx != 0` in this branch.
> + // - The index does not exceed `before_tx.len()` (which equals `tx`) because
> + // `rx <= tx` in this branch.
> + (after_tx, &mut before_tx[..(rx - 1)])
If this is written with get_disjoint_mut, the indices would be so much easier to
understand... To bad that that API is only available from 1.86 onwards.
Best,
Gary
> }
> }
>
On Tue Jan 27, 2026 at 3:26 AM JST, Gary Guo wrote:
> On Fri Jan 23, 2026 at 12:12 PM GMT, Eliot Courtney wrote:
>> The current code hands out buffers that go all the way up to and
>> including `rx - 1`, but we need to maintain an empty slot to prevent the
>> ring buffer from wrapping around into having 'tx == rx', which means
>> empty.
>>
>> Also add more rigorous no-panic proofs.
>>
>> Fixes: 75f6b1de8133 ("gpu: nova-core: gsp: Add GSP command queue bindings and handling")
>> Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
>> ---
>> drivers/gpu/nova-core/gsp/cmdq.rs | 33 +++++++++++++++++++--------------
>> 1 file changed, 19 insertions(+), 14 deletions(-)
>>
>> diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
>> index 09c28eeb6f12..aa8758fc7723 100644
>> --- a/drivers/gpu/nova-core/gsp/cmdq.rs
>> +++ b/drivers/gpu/nova-core/gsp/cmdq.rs
>> @@ -227,21 +227,26 @@ fn new(dev: &device::Device<device::Bound>) -> Result<Self> {
>> // PANIC: per the invariant of `cpu_write_ptr`, `tx` is `< MSGQ_NUM_PAGES`.
>> let (before_tx, after_tx) = gsp_mem.cpuq.msgq.data.split_at_mut(tx);
>>
>> - if rx <= tx {
>> - // The area from `tx` up to the end of the ring, and from the beginning of the ring up
>> - // to `rx`, minus one unit, belongs to the driver.
>> - if rx == 0 {
>> - let last = after_tx.len() - 1;
>> - (&mut after_tx[..last], &mut before_tx[0..0])
>> - } else {
>> - (after_tx, &mut before_tx[..rx])
>> - }
>> + // The area starting at `tx` and ending at `rx - 2` modulo MSGQ_NUM_PAGES, inclusive,
>> + // belongs to the driver for writing.
>> + if rx == 0 {
>> + // Since `rx` is zero, leave an empty slot at end of the buffer.
>> + let last = after_tx.len() - 1;
>> + (&mut after_tx[..last], &mut before_tx[0..0])
>
> Does the address actually matter? Otherwise I would find `&mut []` easier to
> understand than an empty indexing.
The address doesn't matter, and indeed I am not sure why we did that
(possibly a lifetime issue with some previous version of the code?).
Your suggestion seems to work fine, so Eliot feel free to include it in
your series (as a separate patch please to make sure we can isolate the
effects of both changes).
>
>> + } else if rx > tx {
>> + // The area is contiguous and we leave an empty slot before `rx`.
>> + // PANIC:
>> + // - The index `rx - tx - 1` is non-negative because `rx > tx` in this branch.
>> + // - The index does not exceed `after_tx.len()` (which is `MSGQ_NUM_PAGES - tx`)
>> + // because `rx < MSGQ_NUM_PAGES` by the `gsp_read_ptr` invariant.
>> + (&mut after_tx[..(rx - tx - 1)], &mut before_tx[0..0])
>> } else {
>> - // The area from `tx` to `rx`, minus one unit, belongs to the driver.
>> - //
>> - // PANIC: per the invariants of `cpu_write_ptr` and `gsp_read_ptr`, `rx` and `tx` are
>> - // `<= MSGQ_NUM_PAGES`, and the test above ensured that `rx > tx`.
>> - (after_tx.split_at_mut(rx - tx).0, &mut before_tx[0..0])
>> + // The area is discontiguous and we leave an empty slot before `rx`.
>> + // PANIC:
>> + // - The index `rx - 1` is non-negative because `rx != 0` in this branch.
>> + // - The index does not exceed `before_tx.len()` (which equals `tx`) because
>> + // `rx <= tx` in this branch.
>> + (after_tx, &mut before_tx[..(rx - 1)])
>
> If this is written with get_disjoint_mut, the indices would be so much easier to
> understand... To bad that that API is only available from 1.86 onwards.
I for one am longing for `split_at_checked`... :)
© 2016 - 2026 Red Hat, Inc.