From: Jiri Pirko <jiri@nvidia.com>
Document shared devlink instances for multiple PFs on the same chip.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
.../networking/devlink/devlink-shared.rst | 66 +++++++++++++++++++
Documentation/networking/devlink/index.rst | 1 +
2 files changed, 67 insertions(+)
create mode 100644 Documentation/networking/devlink/devlink-shared.rst
diff --git a/Documentation/networking/devlink/devlink-shared.rst b/Documentation/networking/devlink/devlink-shared.rst
new file mode 100644
index 000000000000..be9dd6f295df
--- /dev/null
+++ b/Documentation/networking/devlink/devlink-shared.rst
@@ -0,0 +1,66 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+============================
+Devlink Shared Instances
+============================
+
+Overview
+========
+
+Shared devlink instances allow multiple physical functions (PFs) on the same
+chip to share an additional devlink instance for chip-wide operations. This
+should be implemented within individual drivers alongside the per-PF
+devlink instances, not as a replacement for them.
+
+The shared devlink instance should be backed by a faux device and should
+provide a common interface for operations that affect the entire chip
+rather than individual PFs.
+
+Implementation
+==============
+
+Architecture
+------------
+
+The implementation should use:
+
+* **Faux device**: Virtual device backing the shared devlink instance
+* **Chip identification**: PFs are grouped by chip using a driver-specific identifier
+* **Shared instance management**: Global list of shared instances with reference counting
+
+Initialization Flow
+-------------------
+
+1. **PF calls shared devlink init** during driver probe
+2. **Chip identification** using driver-specific method to determine device identity
+3. **Lookup existing shared instance** for this chip identifier
+4. **Create new shared instance** if none exists:
+
+ * Create faux device with chip identifier as name
+ * Allocate and register devlink instance
+ * Add to global shared instances list
+
+5. **Add PF to shared instance** PF list
+6. **Set nested devlink instance** for the PF devlink instance
+
+Cleanup Flow
+------------
+
+1. **Cleanup** when a PF is removed; destroy the shared instance when the last PF is removed
+
+Chip Identification
+-------------------
+
+PFs belonging to the same chip are identified using a driver-specific method.
+The driver is free to choose any identifier that is suitable for determining
+whether two PFs are part of the same device. Examples include VPD serial numbers,
+device tree properties, or other hardware-specific identifiers.
+
+Locking
+-------
+
+A global per-driver mutex protects the shared instances list and individual shared
+instance PF lists during registration/deregistration.
+
+Similarly to other nested devlink instance relationships, the devlink lock of
+the shared instance should always be taken after the devlink lock of the PF.
diff --git a/Documentation/networking/devlink/index.rst b/Documentation/networking/devlink/index.rst
index 35b12a2bfeba..f7ba7dcf477d 100644
--- a/Documentation/networking/devlink/index.rst
+++ b/Documentation/networking/devlink/index.rst
@@ -68,6 +68,7 @@ general.
devlink-resource
devlink-selftests
devlink-trap
+ devlink-shared
Driver-specific documentation
-----------------------------
--
2.31.1
On Tue, 25 Nov 2025 22:06:01 +0200 Tariq Toukan wrote:
> +The shared devlink instance should be backed by a faux device and should
> +provide a common interface for operations that affect the entire chip
> +rather than individual PFs.

If we go with this we must state very clearly that this is a crutch and
_not_ the recommended configuration...

> +5. **Add PF to shared instance** PF list
> +6. **Set nested devlink instance** for the PF devlink instance

... because presumably we could use this infra to manage a single
devlink instance? Which is what I asked for initially.

> +A global per-driver mutex protects the shared instances list and individual shared
> +instance PF lists during registration/deregistration.

Why can't this mutex live in the core?
Fri, Nov 28, 2025 at 05:16:45AM +0100, kuba@kernel.org wrote:
> If we go with this we must state very clearly that this is a crutch and
> _not_ the recommended configuration...

Why "not recommended"? If there is a use case for this in a different
driver, it is probably good to utilize the shared instance, isn't it?
Perhaps I'm missing something.

> ... because presumably we could use this infra to manage a single
> devlink instance? Which is what I asked for initially.

I'm not sure I follow. If there is only one PF bound, there is a 1:1
relationship. It depends on how many PFs of the same ASIC you have.

> Why can't this mutex live in the core?

Well, the mutex protects the list of instances, which are managed in the
driver. If you want to move the mutex, I don't see how to do it without
moving all the code related to shared devlink instances, including the
faux probe etc. Is that what you suggest?
On Fri, 28 Nov 2025 12:00:13 +0100 Jiri Pirko wrote:
> Why "not recommended"? If there is a use case for this in a different
> driver, it is probably good to utilize the shared instance, isn't it?
> Perhaps I'm missing something.

Having a single instance seems preferable from the user's point of view.

> I'm not sure I follow. If there is only one PF bound, there is a 1:1
> relationship. It depends on how many PFs of the same ASIC you have.

I'm talking about multi-PF devices. mlx5 supports multi-PF setup for
NUMA locality IIUC. In such configurations per-PF parameters can be
configured on PCI PF ports.

> Well, the mutex protects the list of instances, which are managed in the
> driver. If you want to move the mutex, I don't see how to do it without
> moving all the code related to shared devlink instances, including the
> faux probe etc. Is that what you suggest?

Multiple ways you can solve it, but drivers should have to duplicate
all the instance management and locking. BTW please don't use guard().
Sat, Nov 29, 2025 at 04:19:24AM +0100, kuba@kernel.org wrote:
> Having a single instance seems preferable from the user's point of view.

Sure, if there is no need for sharing, correct.

> I'm talking about multi-PF devices. mlx5 supports multi-PF setup for
> NUMA locality IIUC. In such configurations per-PF parameters can be
> configured on PCI PF ports.

Correct. AFAIK there is one PF devlink instance per NUMA node.
The shared instance on top would make sense to me. That was one of the
motivations to introduce it. Then this shared instance would hold the
netdev, VF representors etc.

> Multiple ways you can solve it, but drivers should have to duplicate
> all the instance management and locking. BTW please don't use guard().

I'm having trouble understanding what you say, sorry :/ Do you prefer to
move the code from the driver to the devlink core or not?

Regarding guard(), sure. I wonder how much more time it's gonna take
until this resistance fades out :)
On Mon, 1 Dec 2025 11:50:08 +0100 Jiri Pirko wrote:
> Correct. AFAIK there is one PF devlink instance per NUMA node.

You say "correct" and then disagree with what I'm saying. I said ports
because a port is a devlink object, not a devlink instance.

> The shared instance on top would make sense to me. That was one of the
> motivations to introduce it. Then this shared instance would hold the
> netdev, VF representors etc.

I don't understand what the shared instance is representing, and how the
user is expected to find their way through the maze of devlink instances
for the real bus, the aux bus, and now shared instances.

> I'm having trouble understanding what you say, sorry :/ Do you prefer to
> move the code from the driver to the devlink core or not?

I missed a "not".. drivers should _not_ have to duplicate, sorry.

> Regarding guard(), sure. I wonder how much more time it's gonna take
> until this resistance fades out :)

guard() locks code instead of data accesses. We used to make fun of Java
in this community, you know.