The auxiliary device registration was using a hardcoded ID of 0, which
caused probe() to fail on multi-GPU systems with:
sysfs: cannot create duplicate filename '/bus/auxiliary/devices/NovaCore.nova-drm.0'
Fix this by using an atomic counter to generate unique IDs for each
GPU's aux device registration. The TODO item to eventually use XArray
for recycling aux device IDs is retained, but for now, this works very
nicely.
This has the side effect of making debugfs[1] work on multi-GPU systems.
[1] https://lore.kernel.org/20260203224757.871729-1-ttabi@nvidia.com
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
drivers/gpu/nova-core/driver.rs | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
Hi,
This is based on today's (Feb 4, 2026) linux-next/master branch.
thanks,
John Hubbard
diff --git a/drivers/gpu/nova-core/driver.rs b/drivers/gpu/nova-core/driver.rs
index 5a4cc047bcfc..a542ec0b40fa 100644
--- a/drivers/gpu/nova-core/driver.rs
+++ b/drivers/gpu/nova-core/driver.rs
@@ -1,5 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 
+use core::sync::atomic::{AtomicU32, Ordering};
+
 use kernel::{
     auxiliary,
     device::Core,
@@ -19,6 +21,9 @@
 
 use crate::gpu::Gpu;
 
+/// Counter for generating unique auxiliary device IDs.
+static AUXILIARY_ID_COUNTER: AtomicU32 = AtomicU32::new(0);
+
 #[pin_data]
 pub(crate) struct NovaCore {
     #[pin]
@@ -85,12 +90,17 @@ fn probe(pdev: &pci::Device<Core>, _info: &Self::IdInfo) -> impl PinInit<Self, E
                 GFP_KERNEL,
             )?;
 
+            // TODO[XARR]: Use XArray for proper ID allocation/recycling; for now we use a simple
+            // atomic counter which never recycles IDs. A unique ID is required for multi-GPU
+            // systems; without it, probe() fails for all but the first GPU.
+            let aux_id = AUXILIARY_ID_COUNTER.fetch_add(1, Ordering::Relaxed);
+
             Ok(try_pin_init!(Self {
                 gpu <- Gpu::new(pdev, bar.clone(), bar.access(pdev.as_ref())?),
                 _reg <- auxiliary::Registration::new(
                     pdev.as_ref(),
                     c"nova-drm",
-                    0, // TODO[XARR]: Once it lands, use XArray; for now we don't use the ID.
+                    aux_id,
                     crate::MODULE_NAME
                 ),
             }))
base-commit: 0f8a890c4524d6e4013ff225e70de2aed7e6d726
--
2.53.0
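For readers following the thread, here is a minimal userspace sketch of the idea in the patch above. It is not kernel code: `FakeAuxBus` and `next_aux_id` are hypothetical names made up for illustration, and the standard library's `AtomicU32` stands in for the driver's counter. The `modname.name.id` naming is taken from the sysfs path in the error message; everything else is an assumption.

```rust
use std::collections::HashSet;
use std::sync::atomic::{AtomicU32, Ordering};

/// Userspace stand-in for the static counter added by the patch.
static AUXILIARY_ID_COUNTER: AtomicU32 = AtomicU32::new(0);

/// Hypothetical stand-in for the auxiliary bus: it only remembers which
/// device names have already been registered.
pub struct FakeAuxBus {
    names: HashSet<String>,
}

impl FakeAuxBus {
    pub fn new() -> Self {
        Self { names: HashSet::new() }
    }

    /// Registers `<modname>.<name>.<id>` and rejects a name that already
    /// exists, mimicking the "duplicate filename" failure from sysfs.
    pub fn register(&mut self, modname: &str, name: &str, id: u32) -> Result<String, String> {
        let full = format!("{}.{}.{}", modname, name, id);
        if !self.names.insert(full.clone()) {
            return Err(format!("cannot create duplicate filename '{}'", full));
        }
        Ok(full)
    }
}

/// Returns a different ID on every call until the counter wraps.
/// IDs are never recycled.
pub fn next_aux_id() -> u32 {
    AUXILIARY_ID_COUNTER.fetch_add(1, Ordering::Relaxed)
}
```

With a fixed ID of 0, the second `register("NovaCore", "nova-drm", 0)` call on the same bus returns an error, which mirrors the second GPU failing to probe. Passing `next_aux_id()` instead gives each registration its own name.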
On Thu Feb 5, 2026 at 4:11 AM GMT, John Hubbard wrote:
> The auxiliary device registration was using a hardcoded ID of 0, which
> caused probe() to fail on multi-GPU systems with:
>
> sysfs: cannot create duplicate filename '/bus/auxiliary/devices/NovaCore.nova-drm.0'
>
> Fix this by using an atomic counter to generate unique IDs for each
> GPU's aux device registration. The TODO item to eventually use XArray
> for recycling aux device IDs is retained, but for now, this works very
> nicely.
>
> This has the side effect of making debugfs[1] work on multi-GPU systems.
Hi John,
Looks like this is something that should be achieved via IDA?
Cc: Matthew Wilcox <willy@infradead.org>
>
> [1] https://lore.kernel.org/20260203224757.871729-1-ttabi@nvidia.com
>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---
> drivers/gpu/nova-core/driver.rs | 12 +++++++++++-
> 1 file changed, 11 insertions(+), 1 deletion(-)
>
> Hi,
>
> This is based on today's (Feb 4, 2026) linux-next/master branch.
>
> thanks,
> John Hubbard
>
> diff --git a/drivers/gpu/nova-core/driver.rs b/drivers/gpu/nova-core/driver.rs
> index 5a4cc047bcfc..a542ec0b40fa 100644
> --- a/drivers/gpu/nova-core/driver.rs
> +++ b/drivers/gpu/nova-core/driver.rs
> @@ -1,5 +1,7 @@
>  // SPDX-License-Identifier: GPL-2.0
>
> +use core::sync::atomic::{AtomicU32, Ordering};
We're stopping the use of Rust atomics. Please use LKMM atomics available from
`kernel::sync::atomic`.
Best,
Gary
> +
>  use kernel::{
>      auxiliary,
>      device::Core,
> @@ -19,6 +21,9 @@
>
>  use crate::gpu::Gpu;
>
> +/// Counter for generating unique auxiliary device IDs.
> +static AUXILIARY_ID_COUNTER: AtomicU32 = AtomicU32::new(0);
> +
>  #[pin_data]
>  pub(crate) struct NovaCore {
>      #[pin]
> @@ -85,12 +90,17 @@ fn probe(pdev: &pci::Device<Core>, _info: &Self::IdInfo) -> impl PinInit<Self, E
>                  GFP_KERNEL,
>              )?;
>
> +            // TODO[XARR]: Use XArray for proper ID allocation/recycling; for now we use a simple
> +            // atomic counter which never recycles IDs. A unique ID is required for multi-GPU
> +            // systems; without it, probe() fails for all but the first GPU.
> +            let aux_id = AUXILIARY_ID_COUNTER.fetch_add(1, Ordering::Relaxed);
> +
>              Ok(try_pin_init!(Self {
>                  gpu <- Gpu::new(pdev, bar.clone(), bar.access(pdev.as_ref())?),
>                  _reg <- auxiliary::Registration::new(
>                      pdev.as_ref(),
>                      c"nova-drm",
> -                    0, // TODO[XARR]: Once it lands, use XArray; for now we don't use the ID.
> +                    aux_id,
>                      crate::MODULE_NAME
>                  ),
>              }))
>
> base-commit: 0f8a890c4524d6e4013ff225e70de2aed7e6d726
On Thu, Feb 05, 2026 at 01:44:27PM +0000, Gary Guo wrote:
> > Fix this by using an atomic counter to generate unique IDs for each
> > GPU's aux device registration. The TODO item to eventually use XArray
> > for recycling aux device IDs is retained, but for now, this works very
> > nicely.
> >
> > This has the side effect of making debugfs[1] work on multi-GPU systems.
>
> Hi John,
>
> Looks like this is something that should be achieved via IDA?
Yes, if you have no need to go from ID to pointer, an IDA is better.
That said, as far as I understand what this code is doing, an atomic_t
solves the problem just fine and is cheaper.
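To make the trade-off in this reply concrete, here is a toy userspace allocator that, unlike a bare counter, hands out the smallest unused ID and allows IDs to be freed and reused. It is only a sketch of the recycling behaviour under discussion. `ToyIda` is a hypothetical name, and the code does not reproduce the kernel IDA's API or its internal data structure.

```rust
use std::collections::BTreeSet;

/// Toy ID allocator: returns the smallest unused ID and lets callers give
/// IDs back. It keeps no ID-to-pointer mapping, only the set of IDs in use.
pub struct ToyIda {
    used: BTreeSet<u32>,
}

impl ToyIda {
    pub fn new() -> Self {
        Self { used: BTreeSet::new() }
    }

    /// Returns the smallest ID not currently in use, or `None` if every
    /// `u32` value is taken. A linear scan is enough for a sketch.
    pub fn alloc(&mut self) -> Option<u32> {
        let mut candidate = 0u32;
        for &id in &self.used {
            if id != candidate {
                // Found a gap below `id`; `candidate` is free.
                break;
            }
            candidate = candidate.checked_add(1)?;
        }
        self.used.insert(candidate);
        Some(candidate)
    }

    /// Releases `id` for reuse; returns `false` if it was not allocated.
    pub fn free(&mut self, id: u32) -> bool {
        self.used.remove(&id)
    }
}
```

After allocating 0, 1 and 2 and then freeing 1, the next allocation returns 1 again. An atomic counter would return 3 and never reuse 1, in exchange for needing no bookkeeping at all.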
On Thu Feb 5, 2026 at 2:48 PM CET, Matthew Wilcox wrote:
> On Thu, Feb 05, 2026 at 01:44:27PM +0000, Gary Guo wrote:
>> > Fix this by using an atomic counter to generate unique IDs for each
>> > GPU's aux device registration. The TODO item to eventually use XArray
>> > for recycling aux device IDs is retained, but for now, this works very
>> > nicely.
>> >
>> > This has the side effect of making debugfs[1] work on multi-GPU systems.
>>
>> Hi John,
>>
>> Looks like this is something that should be achieved via IDA?
>
> Yes, if you have no need to go from ID to pointer, an IDA is better.
> That said, as far as I understand what this code is doing, an atomic_t
> solves the problem just fine and is cheaper.
I agree, for now an atomic should be perfectly fine. Though, with enough
patience binding/unbinding the driver from sysfs you can probably make this
overflow. :)
The reason for the Xarray TODO is that it is one option for a place where
nova-core can store nova-drm / vGPU specific data, once either vGPU or nova-drm
attaches to the auxiliary device. But I think there may be better alternatives.
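The overflow remark can be illustrated in userspace with the standard library's `AtomicU32`, whose `fetch_add` wraps around silently. The second helper shows one possible way to stop handing out IDs once the counter is exhausted. Both function names are hypothetical, and nothing here is taken from the driver or from the kernel's own atomic types.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Plain counter: after returning `u32::MAX` it wraps and returns 0 again,
/// so an ID that is still in use could be handed out a second time.
pub fn next_id_wrapping(counter: &AtomicU32) -> u32 {
    counter.fetch_add(1, Ordering::Relaxed)
}

/// Checked variant: returns `None` instead of wrapping. The increment is
/// refused once the counter holds `u32::MAX`, so that value itself is never
/// handed out.
pub fn next_id_checked(counter: &AtomicU32) -> Option<u32> {
    counter
        .fetch_update(Ordering::Relaxed, Ordering::Relaxed, |v| v.checked_add(1))
        .ok()
}
```

Reaching the wrap takes about 4.29 billion bind/unbind cycles (2^32), which is why the thread treats it as a curiosity rather than a blocker.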
On 2/5/26 6:19 AM, Danilo Krummrich wrote:
> On Thu Feb 5, 2026 at 2:48 PM CET, Matthew Wilcox wrote:
>> On Thu, Feb 05, 2026 at 01:44:27PM +0000, Gary Guo wrote:
>>>> Fix this by using an atomic counter to generate unique IDs for each
>>>> GPU's aux device registration. The TODO item to eventually use XArray
>>>> for recycling aux device IDs is retained, but for now, this works very
>>>> nicely.
>>>>
>>>> This has the side effect of making debugfs[1] work on multi-GPU systems.
>>>
>>> Hi John,
>>>
>>> Looks like this is something that should be achieved via IDA?
>>
>> Yes, if you have no need to go from ID to pointer, an IDA is better.
>> That said, as far as I understand what this code is doing, an atomic_t
>> solves the problem just fine and is cheaper.
>
> I agree, for now an atomic should be perfectly fine. Though, with enough
> patience binding/unbinding the driver from sysfs you can probably make this
> overflow. :)
>
> The reason for the Xarray TODO is that it is one option for a place where
> nova-core can store nova-drm / vGPU specific data, once either vGPU or nova-drm
> attaches to the auxiliary device. But I think there may be better alternatives.
OK, this seems like enough information to post a v2, thanks!
thanks,
--
John Hubbard
On 2/4/26 8:11 PM, John Hubbard wrote:
> The auxiliary device registration was using a hardcoded ID of 0, which
> caused probe() to fail on multi-GPU systems with:
>
> sysfs: cannot create duplicate filename '/bus/auxiliary/devices/NovaCore.nova-drm.0'
>
> Fix this by using an atomic counter to generate unique IDs for each
> GPU's aux device registration. The TODO item to eventually use XArray
> for recycling aux device IDs is retained, but for now, this works very
> nicely.
>
> This has the side effect of making debugfs[1] work on multi-GPU systems.
>
> [1] https://lore.kernel.org/20260203224757.871729-1-ttabi@nvidia.com
>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---
> drivers/gpu/nova-core/driver.rs | 12 +++++++++++-
> 1 file changed, 11 insertions(+), 1 deletion(-)
>
> Hi,
>
> This is based on today's (Feb 4, 2026) linux-next/master branch.
>
> thanks,
> John Hubbard
>
> diff --git a/drivers/gpu/nova-core/driver.rs b/drivers/gpu/nova-core/driver.rs
> index 5a4cc047bcfc..a542ec0b40fa 100644
> --- a/drivers/gpu/nova-core/driver.rs
> +++ b/drivers/gpu/nova-core/driver.rs
> @@ -1,5 +1,7 @@
>  // SPDX-License-Identifier: GPL-2.0
>
> +use core::sync::atomic::{AtomicU32, Ordering};
Somehow the wrong (non-vertical) formatting snuck back into
my patch! Arggh. I'll be glad when rustfmt support for this
can help me catch this.
> +
>  use kernel::{
>      auxiliary,
>      device::Core,
> @@ -19,6 +21,9 @@
>
>  use crate::gpu::Gpu;
>
> +/// Counter for generating unique auxiliary device IDs.
> +static AUXILIARY_ID_COUNTER: AtomicU32 = AtomicU32::new(0);
> +
>  #[pin_data]
>  pub(crate) struct NovaCore {
>      #[pin]
> @@ -85,12 +90,17 @@ fn probe(pdev: &pci::Device<Core>, _info: &Self::IdInfo) -> impl PinInit<Self, E
>                  GFP_KERNEL,
>              )?;
>
> +            // TODO[XARR]: Use XArray for proper ID allocation/recycling; for now we use a simple
I also did *not* mean to leave the word "we" in there.
Lots of little glitches tonight, sorry about those.
thanks,
--
John Hubbard
> +            // atomic counter which never recycles IDs. A unique ID is required for multi-GPU
> +            // systems; without it, probe() fails for all but the first GPU.
> +            let aux_id = AUXILIARY_ID_COUNTER.fetch_add(1, Ordering::Relaxed);
> +
>              Ok(try_pin_init!(Self {
>                  gpu <- Gpu::new(pdev, bar.clone(), bar.access(pdev.as_ref())?),
>                  _reg <- auxiliary::Registration::new(
>                      pdev.as_ref(),
>                      c"nova-drm",
> -                    0, // TODO[XARR]: Once it lands, use XArray; for now we don't use the ID.
> +                    aux_id,
>                      crate::MODULE_NAME
>                  ),
>              }))
>
> base-commit: 0f8a890c4524d6e4013ff225e70de2aed7e6d726