Hi!
Background
==========
This series adds initial ethtool support for CMIS loopback.
The Common Management Interface Specification (CMIS) is an
industry-standard used by host devices (like switches and routers) to
talk to high-speed optical transceivers (like QSFP-DD, OSFP, and
QSFP112).
Ethtool already supports a mechanism for updating the transceiver
firmware via CMIS, and this series builds on top of that work.
In CMIS, four different types of loopback are defined by the
specification, characterized by the location of the loopback on Host
(electrical) or Media (optical) Side and the direction of the signal
being looped-back:
* Media Side Output (Tx->Rx) Loopback
* Media Side Input (Rx->Tx) Loopback
* Host Side Output (Tx->Rx) Loopback
* Host Side Input (Rx->Tx) Loopback
To detect and enable loopback in a CMIS transceiver, the following
registers are used:
* Detect Support: Read Page 13h, Byte 128, which indicates whether the
hardware supports host-side or media-side loopback.
* Enable Loopback: Write to Page 13h, Bytes 180–184. Each bit in these
registers typically corresponds to a specific lane (0–7). Setting a
bit to 1 requests loopback for that lane.
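As an illustration of the per-lane control bytes described above, here
is a minimal sketch, not code from this series. The page and byte
offsets are copied from the text above; the helper names are
hypothetical, and the series itself does not expose per-lane control
(see Limitations).

```c
#include <stdbool.h>
#include <stdint.h>

/* Offsets as quoted in the cover letter; illustrative only. */
#define CMIS_LB_PAGE            0x13
#define CMIS_LB_CTRL_FIRST_BYTE 180
#define CMIS_MAX_LANES          8

/*
 * Hypothetical helper: return the control byte with loopback requested
 * (enable == true) or cleared (enable == false) for a single lane.
 * Lanes are numbered 0-7; an out-of-range lane leaves the byte as is.
 */
static uint8_t cmis_lb_lane_update(uint8_t ctrl, unsigned int lane, bool enable)
{
	if (lane >= CMIS_MAX_LANES)
		return ctrl;
	if (enable)
		return ctrl | (uint8_t)(1u << lane);
	return ctrl & (uint8_t)~(1u << lane);
}

/* Hypothetical helper: is loopback requested for this lane? */
static bool cmis_lb_lane_enabled(uint8_t ctrl, unsigned int lane)
{
	return lane < CMIS_MAX_LANES && (ctrl & (1u << lane)) != 0;
}
```

A caller would read the relevant control byte, pass it through
cmis_lb_lane_update(), and write the result back to the module.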
Implementation
==============
Patch 1/4 ethtool: module: Define CMIS loopback YAML spec and UAPI
Adds the netlink YAML specification and UAPI for module loopback.
Defines a flags enum with the four CMIS 5.2 diagnostic loopback
types (media-side output/input, host-side output/input) and two new
module attributes: loopback-capabilities (supported modes) and
loopback-enabled (active modes). Regenerates the UAPI header.
Patch 2/4 ethtool: module: Add CMIS loopback GET/SET support
Implements the core loopback GET/SET logic for CMIS modules. Reads
capabilities from Page 01h Byte 142 and controls loopback via Page
13h Bytes 180-183, using the existing get/set_module_eeprom_by_page
driver ops. No new ethtool_ops callbacks are introduced.
Patch 3/4 ethtool: module: refactor fw flash init to reuse CMIS helpers
Refactors module_flash_fw_work_init() to reuse the module_is_cmis()
helper and ethtool_cmis_page_init() introduced in patch 2, removing
open-coded CMIS type checking and manual EEPROM page setup from the
firmware flash path.
Patch 4/4 net/mlx5e: Implement set_module_eeprom_by_page ethtool callback
Adds EEPROM write support to mlx5 by implementing
set_module_eeprom_by_page, mirroring the existing read path via the
MCIA register. This enables the loopback SET path which requires
both get and set callbacks.
Limitations
===========
Only four modes are supported: host-/media-side, near-/far-end. No
per-lane support.
I'm working on a kselftest; it's not part of this RFC.
RFC
===
I'm not familiar with the mlx5 internals, and need guidance on whether my
set_module_eeprom_by_page() hack is the right way forward. I've tested
this on a transceiver in a CX7 NIC, and it did switch on the loopback
mode, so it's somewhat working.
I'd like input from other NIC vendors on whether
{set,get}_module_eeprom_by_page() is the right interface/ops from a
driver POV.
Extensibility; Is this the right interface?
Related work
============
* New loopback modes [1].
* PHY loopback [2]
* bnxt_en: add .set_module_eeprom_by_page() support [3]
* ethtool: qsfp transceiver reset, interrupt and presence pin control
[4]
[1] https://lore.kernel.org/netdev/20251024044849.1098222-1-hkelam@marvell.com/
[2] https://lore.kernel.org/netdev/20240911212713.2178943-1-maxime.chevallier@bootlin.com/
[3] https://lore.kernel.org/netdev/20250310183129.3154117-8-michael.chan@broadcom.com/
[4] https://lore.kernel.org/netdev/20250513224017.202236-1-mpazdan@arista.com/
Björn Töpel (4):
ethtool: module: Define CMIS loopback YAML spec and UAPI
ethtool: module: Add CMIS loopback GET/SET support
ethtool: module: refactor fw flash init to reuse CMIS helpers
net/mlx5e: Implement set_module_eeprom_by_page ethtool callback
Documentation/netlink/specs/ethtool.yaml | 27 ++
.../ethernet/mellanox/mlx5/core/en_ethtool.c | 52 ++-
.../ethernet/mellanox/mlx5/core/mlx5_core.h | 6 +-
.../net/ethernet/mellanox/mlx5/core/port.c | 34 +-
include/linux/ethtool.h | 12 +
.../uapi/linux/ethtool_netlink_generated.h | 22 ++
net/ethtool/module.c | 302 ++++++++++++++++--
net/ethtool/netlink.h | 2 +-
8 files changed, 397 insertions(+), 60 deletions(-)
base-commit: 37a93dd5c49b5fda807fd204edf2547c3493319c
--
2.53.0
On Thu, Feb 19, 2026 at 02:00:41PM +0100, Björn Töpel wrote:
> Hi!
>
> Background
> ==========
>
> This series adds initial ethtool support for CMIS loopback.

> Related work
> ============
>
> * New loopback modes [1].
> * PHY loopback [2]

Hi Björn

Great to see you looked around at the problem space.

> [2] https://lore.kernel.org/netdev/20240911212713.2178943-1-maxime.chevallier@bootlin.com/

Quoting myself from this thread:

> We might want to take a step back and think about loopback some more.
>
> Loopback can be done at a number of points in the device(s). Some
> Marvell PHYs can do loopback in the PHY PCS layer. Some devices also
> support loopback in the PHY SERDES layer. I've not seen it for Marvell
> devices, but maybe some PHYs allow loopback much closer to the line?
> And i expect some MAC PCS allow loopback.
>
> So when talking about loopback, we might also want to include the
> concept of where the loopback occurs, and maybe it needs to be a NIC
> wide concept, not a PHY concept?

I still think this is true. We want a generic kAPI for loopback, not a
PHY loopback kAPI, and a MAC loopback kAPI, a PCS loopback kAPI, and
an SFP loopback kAPI, and a CAN bus transceiver loopback kAPI,
assuming CAN bus supports loopback?

So i think we need one ethtool API for loopback. We probably want an
API call to enumerate what loopbacks are supported for a netdev. The
MAC will fill in bits indicating what it can do. If the MAC has a PCS,
it will ask the PCS what it can do. If there is a PHY, it will ask the
PHY to fill in the bits indicating what it can do, if there is an SFP,
it will ask it what it can do, and if there is a CAN bus transceiver,
it will fill in its bits. And we probably want two values for each
loopback location, is it looping the media side, or the MAC side?

So the return value lists all the different loopbacks associated to a
netdev.

And then we need a set operation, to enable/disable a specific
loopback, and a get operation to return the status of all the
different loopbacks of a netdev. The MAC will again need to call into
the PCS, the PHY, the SFP to implement these.

I'm not saying you need to implement all these, you just need to make
what you do implement generic, and plumb it through the network stack
so that others can later easily add PHY, PCS, and MAC loopback
support. And from your background research, you know others are
interested in this, so you might be able to get some help with parts you
are not particularly interested in.

Andrew
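The enumeration idea above (each component in the path fills in the
bits for what it can do) could be sketched roughly as below. This is
only an illustration under assumptions: the capability bit names, the
struct, and the demo callbacks are all hypothetical and do not exist in
the kernel.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability bits: two per location, one per direction. */
#define LB_CAP_MAC_TO_MAC    (1u << 0)
#define LB_CAP_MAC_TO_MEDIA  (1u << 1)
#define LB_CAP_PHY_TO_MAC    (1u << 2)
#define LB_CAP_PHY_TO_MEDIA  (1u << 3)
#define LB_CAP_SFP_TO_MAC    (1u << 4)
#define LB_CAP_SFP_TO_MEDIA  (1u << 5)

/* One component in the data path (MAC, PCS, PHY, SFP, ...). */
struct lb_component {
	const char *name;
	uint32_t (*get_caps)(void);	/* NULL: component has no loopback */
};

/* Demo callbacks standing in for real drivers. */
static uint32_t demo_mac_caps(void)
{
	return LB_CAP_MAC_TO_MAC;
}

static uint32_t demo_sfp_caps(void)
{
	return LB_CAP_SFP_TO_MAC | LB_CAP_SFP_TO_MEDIA;
}

/* Walk the path and OR together what every component reports. */
static uint32_t lb_enumerate_caps(const struct lb_component *chain, size_t n)
{
	uint32_t caps = 0;

	for (size_t i = 0; i < n; i++) {
		if (chain[i].get_caps)
			caps |= chain[i].get_caps();
	}
	return caps;
}
```

The point of the sketch is only that the caller does not need to know
which component contributed which bit; a set operation would need the
reverse mapping to route the request.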
Andrew!

Thanks for the swift response!

On Thu, 19 Feb 2026 at 16:51, Andrew Lunn <andrew@lunn.ch> wrote:
> Great to see you looked around at the problem space.
>
> > [2] https://lore.kernel.org/netdev/20240911212713.2178943-1-maxime.chevallier@bootlin.com/
>
> Quoting myself from this thread:
>
> > We might want to take a step back and think about loopback some more.
> >
> > Loopback can be done at a number of points in the device(s). Some
> > Marvell PHYs can do loopback in the PHY PCS layer. Some devices also
> > support loopback in the PHY SERDES layer. I've not seen it for Marvell
> > devices, but maybe some PHYs allow loopback much closer to the line?
> > And i expect some MAC PCS allow loopback.
> >
> > So when talking about loopback, we might also want to include the
> > concept of where the loopback occurs, and maybe it needs to be a NIC
> > wide concept, not a PHY concept?
>
> I still think this is true. We want a generic kAPI for loopback, not a
> PHY loopback kAPI, and a MAC loopback kAPI, a PCS loopback kAPI, and
> an SFP loopback kAPI, and a CAN bus transceiver loopback kAPI,
> assuming CAN bus supports loopback?
>
> So i think we need one ethtool API for loopback. We probably want an
> API call to enumerate what loopbacks are supported for a netdev. The
> MAC will fill in bits indicating what it can do. If the MAC has a PCS,
> it will ask the PCS what it can do. If there is a PHY, it will ask the
> PHY to fill in the bits indicating what it can do, if there is an SFP,
> it will ask it what it can do, and if there is a CAN bus transceiver,
> it will fill in its bits. And we probably want two values for each
> loopback location, is it looping the media side, or the MAC side?
>
> So the return value lists all the different loopbacks associated to a
> netdev.
>
> And then we need a set operation, to enable/disable a specific
> loopback, and a get operation to return the status of all the
> different loopbacks of a netdev. The MAC will again need to call into
> the PCS, the PHY, the SFP to implement these.
>
> I'm not saying you need to implement all these, you just need to make
> what you do implement generic, and plumb it through the network stack
> so that others can later easily add PHY, PCS, and MAC loopback
> support. And from your background research, you know others are
> interested in this, so you might be able get some help with parts you
> are not particularly interested in.

All good points here; thanks for the elaborate feedback. I like the
idea of a generic loopback API.

Back to the drawing board!

Björn
On Thu, 19 Feb 2026 16:51:47 +0100 Andrew Lunn wrote:
> > We might want to take a step back and think about loopback some more.
> >
> > Loopback can be done at a number of points in the device(s). Some
> > Marvell PHYs can do loopback in the PHY PCS layer. Some devices also
> > support loopback in the PHY SERDES layer. I've not seen it for Marvell
> > devices, but maybe some PHYs allow loopback much closer to the line?
> > And i expect some MAC PCS allow loopback.
> >
> > So when talking about loopback, we might also want to include the
> > concept of where the loopback occurs, and maybe it needs to be a NIC
> > wide concept, not a PHY concept?
>
> I still think this is true. We want a generic kAPI for loopback, not a
> PHY loopback kAPI, and a MAC loopback kAPI, a PCS loopback kAPI, and
> an SFP loopback kAPI, and a CAN bus transceiver loopback kAPI,
> assuming CAN bus supports loopback?
>
> So i think we need one ethtool API for loopback. We probably want an
> API call to enumerate what loopbacks are supported for a netdev. The
> MAC will fill in bits indicating what it can do. If the MAC has a PCS,
> it will ask the PCS what it can do. If there is a PHY, it will ask the
> PHY to fill in the bits indicating what it can do, if there is an SFP,
> it will ask it what it can do, and if there is a CAN bus transceiver,
> it will fill in its bits. And we probably want two values for each
> loopback location, is it looping the media side, or the MAC side?
>
> So the return value lists all the different loopbacks associated to a
> netdev.
>
> And then we need a set operation, to enable/disable a specific
> loopback, and a get operation to return the status of all the
> different loopbacks of a netdev. The MAC will again need to call into
> the PCS, the PHY, the SFP to implement these.
>
> I'm not saying you need to implement all these, you just need to make
> what you do implement generic, and plumb it through the network stack
> so that others can later easily add PHY, PCS, and MAC loopback
> support. And from your background research, you know others are
> interested in this, so you might be able get some help with parts you
> are not particularly interested in.
Something like:
struct {
enum type; // MAC, PHY, SFP
int type_id; // if type=PHY - phy id
int depth; // counting from CPU, first loopback opportunity is 1
// second is 2, etc.
bool direction; // towards CPU/host vs towards network
char name[16]; // "pcs", "far", "near", "post-fec", whatever
}
?
> Something like:
>
> struct {
> enum type; // MAC, PHY, SFP
> int type_id; // if type=PHY - phy id
> int depth; // counting from CPU, first loopback opportunity is 1
> // second is 2, etc.
> bool direction; // towards CPU/host vs towards network
> char name[16]; // "pcs", "far", "near", "post-fec", whatever
> }
Let's see what comes from the drawing board, but i was more thinking
about expanding the bitmap this proposal already has, extending it to
other layers. As use cases are implemented, we define the bits needed
in the map. The ethtool kAPI has the needed infrastructure to map bits
to names, it is used for link modes etc, and that can be used here. So
the ethtool(1) part should be reasonably generic.
Andrew
On Fri, 20 Feb 2026 15:18:40 +0100 Andrew Lunn wrote:
> > Something like:
> >
> > struct {
> > enum type; // MAC, PHY, SFP
> > int type_id; // if type=PHY - phy id
> > int depth; // counting from CPU, first loopback opportunity is 1
> > // second is 2, etc.
> > bool direction; // towards CPU/host vs towards network
> > char name[16]; // "pcs", "far", "near", "post-fec", whatever
> > }
>
> Lets see what comes from the drawing board, but i was more thinking
> about expanding the bitmap this proposal already has, extending it to
> other layers.
IIUC the bitmap this proposal has is basically a product of
direction x depth: [host, network] x [nearest, furthest]
plus it's scoped to SFP.
> As use cases are implemented, we define the bits needed
> in the map.
Sure, but if we are creating a dedicated API we should decompose
the information from the start. Direction, and entity (MAC, PHY, SFP)
don't have to be part of the bitmap?
> The ethtool kAPI has the needed infrastructure to map bits
> to names, it is used for link modes etc, and that can be used here. So
> the ethtool(1) part should be reasonably generic.
Dunno if link modes are the right point of reference. Link mode is
a combination of various parameters which must match on both sides
exactly. For the loopback the config is very simple, the expressiveness
is needed to explain where the configuration is applied.
IOW for link modes it's important to have an ID for the combination of
all params to easily check if the whole thing is as expected.
For loopback it's easier to think of it as traversing attribute by
attribute: MAC / PHY / SFP -> which one -> which depth -> which dir.
Single id has no benefit and would be cumbersome to define.
Or at least that's my intuition, I haven't used loopback much myself.
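To illustrate the decomposition argument above, here is a small sketch
that splits each of the four flat loopback flags from the cover letter
into two independent attributes (side and signal direction). The enum
and function names are hypothetical, chosen for this example only; they
are not the names used in the RFC's UAPI.

```c
#include <stdbool.h>

/* Flat flags, one bit per CMIS loopback type (names are hypothetical). */
enum cmis_lb_flag {
	LB_MEDIA_SIDE_OUTPUT = 1 << 0,	/* Media Side Output (Tx->Rx) */
	LB_MEDIA_SIDE_INPUT  = 1 << 1,	/* Media Side Input (Rx->Tx) */
	LB_HOST_SIDE_OUTPUT  = 1 << 2,	/* Host Side Output (Tx->Rx) */
	LB_HOST_SIDE_INPUT   = 1 << 3,	/* Host Side Input (Rx->Tx) */
};

enum lb_side { LB_SIDE_HOST, LB_SIDE_MEDIA };
enum lb_signal { LB_SIGNAL_OUTPUT, LB_SIGNAL_INPUT };

struct lb_attr {
	enum lb_side side;
	enum lb_signal signal;
};

/*
 * Decompose exactly one flag into its attributes. Returns false for
 * zero, unknown bits, or a combination of several flags.
 */
static bool lb_flag_decompose(unsigned int flag, struct lb_attr *out)
{
	switch (flag) {
	case LB_MEDIA_SIDE_OUTPUT:
		*out = (struct lb_attr){ LB_SIDE_MEDIA, LB_SIGNAL_OUTPUT };
		return true;
	case LB_MEDIA_SIDE_INPUT:
		*out = (struct lb_attr){ LB_SIDE_MEDIA, LB_SIGNAL_INPUT };
		return true;
	case LB_HOST_SIDE_OUTPUT:
		*out = (struct lb_attr){ LB_SIDE_HOST, LB_SIGNAL_OUTPUT };
		return true;
	case LB_HOST_SIDE_INPUT:
		*out = (struct lb_attr){ LB_SIDE_HOST, LB_SIGNAL_INPUT };
		return true;
	default:
		return false;
	}
}
```

Every new location (MAC, PHY, ...) added to the flat form needs its own
set of bits, while the decomposed form only grows one attribute.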
Andrew/Kuba!
Thanks for having a look!
On Fri, 20 Feb 2026 at 22:12, Jakub Kicinski <kuba@kernel.org> wrote:
>> > Something like:
>> >
>> > struct {
>> > enum type; // MAC, PHY, SFP
>> > int type_id; // if type=PHY - phy id
>> > int depth; // counting from CPU, first loopback opportunity is 1
>> > // second is 2, etc.
>> > bool direction; // towards CPU/host vs towards network
>> > char name[16]; // "pcs", "far", "near", "post-fec", whatever
>> > }
>>
>> Lets see what comes from the drawing board, but i was more thinking
>> about expanding the bitmap this proposal already has, extending it to
>> other layers.
>
>IIUC the bitmap this proposal has is basically a product of
>direction x depth: [host, network] x [nearest, furthest]
>plus its scoped to SFP.
(Digging into different ways to loopback, rather than drawing board,
but here goes! ;-))
I'd agree with Jakub here, unless I'm not getting the details of what
you mean, Andrew. Sounds like we'd end up with a huge bitmap? I
suggest tweaking Jakub's idea to something like:
/* loopback.c: A new ETHTOOL_MSG_LOOPBACK_{GET,SET} */
/* Loopback layers/scope */
enum ethtool_loopback_layer {
ETHTOOL_LB_LAYER_SW = 0, /* Software/Kernel stack loopback */
ETHTOOL_LB_LAYER_MAC, /* MAC/Controller internal */
ETHTOOL_LB_LAYER_PCS, /* Physical Coding Sublayer (Digital) */
ETHTOOL_LB_LAYER_PMA, /* SerDes / Analog-Digital boundary */
ETHTOOL_LB_LAYER_PMD, /* Transceiver / Module internal */
ETHTOOL_LB_LAYER_EXT, /* External physical plug/cable */
};
/* Loopback Direction (XXX is local/remote easier to understand?) */
enum ethtool_loopback_dir {
ETHTOOL_LB_DIR_NEAR_END = 0, /* Host -> Loop -> Host */
ETHTOOL_LB_DIR_FAR_END, /* Line -> Loop -> Line */
};
struct ethtool_loopback_layer_cfg {
enum ethtool_loopback_layer layer; /* ETHTOOL_LB_LAYER_MAC, etc. */
enum ethtool_loopback_dir direction; /* NEAR or FAR */
u32 lane_mask; /* Specific lanes */
u32 flags; /* patterns? reserved... */
bool enabled;
char name[16];
};
struct ethtool_loopback_cfg {
struct ethtool_loopback_layer_cfg *entries;
u32 num_layers;
};
struct ethtool_ops {
/* ... */
/* Query which layers/lane-combos are physically possible */
int (*get_loopback_caps)(struct net_device *dev,
struct ethtool_loopback_cfg *caps);
/* Get current active status for all layers */
int (*get_loopback_state)(struct net_device *dev,
struct ethtool_loopback_cfg *state);
/* Set one or more layer/lane configurations atomically */
int (*set_loopback)(struct net_device *dev,
const struct ethtool_loopback_cfg *cfg,
struct netlink_ext_ack *extack);
};
As for layers; EXT vs PMD? EXT could be a loopback plug, whereas PMD
would be CMIS, or whatever the driver detects.
Userland would be something like:
# ethtool --show-loopback eth0
Loopback Status for eth0:
Layer: SW | State: OFF
Layer: MAC | State: OFF
Layer: PMA | State: ON | Lanes: 0x1 (Lane 0) | Direction: Near-End (Local)
Layer: PMD | State: ON | Lanes: 0xF (All) | Direction: Far-End (Remote)
Layer: EXT | State: ON | Detected: External Loopback Plug
# ethtool --set-loopback <dev> [layer[:lanes][:direction]] ... [off]
# # Simple MAC loopback:
# ethtool --set-loopback eth0 mac (Defaults: lanes=all, dir=near)
# # Specific SerDes (PMA) lane:
# ethtool --set-loopback eth0 pma:lane0
# # Complex multi-layer (PMA Near + PMD Far):
# ethtool --set-loopback eth0 pma:0x1:near pmd:all:far
# # Disable all loopbacks:
# ethtool --set-loopback eth0 off
Thoughts? Is this somewhat close to what you had in mind, Andrew?
I'm far from an expert on the details here, so the folks with more
knowledge, please chime in!
Cheers,
Björn
Hi Björn,

On 22/02/2026 20:58, Björn Töpel wrote:
> # # Simple MAC loopback:
> # ethtool --set-loopback eth0 mac (Defaults: lanes=all, dir=near)
> # # Specific SerDes (PMA) lane:
> # ethtool --set-loopback eth0 pma:lane0
> # # Complex multi-layer (PMA Near + PMD Far):
> # ethtool --set-loopback eth0 pma:0x1:near pmd:all:far
> # # Disable all loopbacks:
> ethtool --set-loopback eth0 off
>
> Thoughts? Is this somewhat close to what you had in mind, Andrew?
>
> I'm far from an expert on the details here, so the folks with more
> knowledge, please chime in!

I'm very sorry not to have looked into this yet, I'm having some family
events to handle right-now but I hope to be back next monday to take a
close look at this :)

Thanks for this work,

Maxime
Maxime!

On Wed, 25 Feb 2026 at 11:22, Maxime Chevallier
<maxime.chevallier@bootlin.com> wrote:
> I'm very sorry not to have looked into this yet, I'm having some family
> events to handle right-now but I hope to be back next monday to take a
> close look at this :)

No rush, and thank you! Take care of your family -- the patches will
still be around when you're back!

Cheers,
Björn
On Sun, 22 Feb 2026 20:58:30 +0100 Björn Töpel wrote:
> /* Loopback layers/scope
> enum ethtool_loopback_layer {
> ETHTOOL_LB_LAYER_SW = 0, /* Software/Kernel stack loopback */
What would that be? :)
> ETHTOOL_LB_LAYER_MAC, /* MAC/Controller internal */
> ETHTOOL_LB_LAYER_PCS, /* Physical Coding Sublayer (Digital) */
> ETHTOOL_LB_LAYER_PMA, /* SerDes / Analog-Digital boundary */
> ETHTOOL_LB_LAYER_PMD, /* Transceiver / Module internal */
In my mind the "layer" was supposed to tell core which driver to send
the request to. Same concept is used in the timestamp source selection.
PCS/PMA/PMD is both too fine grained when you have multiple PHYs in the
path, and does not cover all the possible loopback points.
> ETHTOOL_LB_LAYER_EXT, /* External physical plug/cable */
is EXT used somewhere to mean SFP already?
Good feedback, Kuba! ...more thinking...
On Tue, 24 Feb 2026 at 00:04, Jakub Kicinski <kuba@kernel.org> wrote:
...
As a reference, there's a v2-ish version here (API I suggested +
netdevsim plumbing) [1].
What I discovered was that the "layer"/IEEE abstraction I suggested
didn't even fit nicely with the CMIS module loopback work -- I had to
split PMD into PMD-HOST/PMD-MODULE to fit the CMIS spec, which felt
like a hack.
I've obviously taken too high-level a "PCIe device" view of the world.
Looking at Maxime's slides from netdev 0x17 [2] reveals a more
complicated world; multiple PHY-like things in the path (onboard PHY,
module, external PHY, ...). My model doesn't cover what Maxime
outlines.
Instead, like Jakub hinted at, userspace should reason in terms of
"loopback entities" (or IDs) that correspond to "places where loopback
can be applied", not in terms of PCS/PMA/PMD topology.
A loopback entity would have:
* an ID / type (which matches a driver),
* a direction,
* an optional descriptive name.
Reiterating what Jakub said earlier, using hwstamp-source as a
precedent, you could imagine something like:
struct {
enum type; // MAC, PHY, SFP
int type_id; // if type=PHY - phy id
int depth; // counting from CPU, first loopback opportunity is 1
// second is 2, etc.
bool direction; // towards CPU/host vs towards network
char name[16]; // "pcs", "far", "near", "post-fec", whatever
}
I'm not sure that depth/name are actually useful here.
Concretely, that could look like this in ethtool:
/* What kind of kernel object owns this loopback. */
enum ethtool_loopback_owner {
ETHTOOL_LB_OWNER_MAC, /* struct net_device / MAC driver */
ETHTOOL_LB_OWNER_PHY, /* struct phy_device (phylib PHY) */
ETHTOOL_LB_OWNER_MODULE, /* module / SFP driver, e.g. CMIS */
};
enum ethtool_loopback_dir {
ETHTOOL_LB_DIR_HOST_LOOP_HOST,
ETHTOOL_LB_DIR_LINE_LOOP_LINE,
};
/* One "loopback entity" associated with the netdev. */
struct ethtool_loopback_entry {
u32 id; /* per-netdev */
enum ethtool_loopback_owner owner;
u32 depth; /* ??? */
enum ethtool_loopback_dir direction; /* enum ethtool_loopback_dir */
bool enabled;
char name[16]; /* ??? "mac",
"phy-pcs", "module-far", ... */
};
/* Aggregate for GET: list all possible endpoints + their status. */
struct ethtool_loopback_state {
struct ethtool_loopback_entry entries[16];
unsigned int n_entries;
};
From a driver POV:
* The MAC driver can add its own MAC-level loopback entity (owner =
MAC).
* If there is one or more phydevs, it can call into a phylib helper
(e.g. phy_get_loopback_state()) to append PHY entities (owner =
PHY).
* If there is a CMIS module, it can append its entities (owner = MODULE).
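The "append" steps above could look roughly like the sketch below. It
is a userspace approximation under assumptions: the types mirror the
proposed structs with stdint types, the depth field is left out, and
lb_state_append() is a hypothetical helper, not an existing phylib or
ethtool function.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define LB_MAX_ENTRIES 16

enum lb_owner { LB_OWNER_MAC, LB_OWNER_PHY, LB_OWNER_MODULE };
enum lb_dir { LB_DIR_HOST_LOOP_HOST, LB_DIR_LINE_LOOP_LINE };

struct lb_entry {
	uint32_t id;		/* per-netdev, assigned on append */
	enum lb_owner owner;
	enum lb_dir direction;
	bool enabled;
	char name[16];
};

struct lb_state {
	struct lb_entry entries[LB_MAX_ENTRIES];
	unsigned int n_entries;
};

/*
 * Hypothetical helper: each driver layer (MAC, PHY, module) calls this
 * to add its loopback entities. Returns 0, or -ENOSPC when full.
 */
static int lb_state_append(struct lb_state *st, enum lb_owner owner,
			   enum lb_dir dir, const char *name)
{
	struct lb_entry *e;

	if (st->n_entries >= LB_MAX_ENTRIES)
		return -ENOSPC;

	e = &st->entries[st->n_entries];
	memset(e, 0, sizeof(*e));
	e->id = st->n_entries;
	e->owner = owner;
	e->direction = dir;
	/* memset above guarantees NUL termination after a short copy. */
	strncpy(e->name, name, sizeof(e->name) - 1);
	st->n_entries++;
	return 0;
}
```

Note that the id here is simply the append position, so it depends on
which layers register entries and in what order.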
# ethtool --show-loopback eth0
Loopback endpoints for eth0:
id owner depth direction name enabled
0 mac 1 host->loop->host mac off
1 phy 2 host->loop->host phy-pcs on
2 module 3 line->loop->line cmis-far off
# Enable endpoint 2
ethtool --set-loopback eth0 id 2 on
Does this model make more sense, especially for the multi-PHY / CMIS
cases Maxime described?
Björn
[1] https://github.com/fb-bjorn/linux/compare/master...fb-bjorn:linux:ethtool-loopback-rfc-v2
[2] https://netdevconf.info/0x17/docs/netdev-0x17-paper2-talk-slides/multi-port-multi-phy-interfaces.pdf
> # ethtool --show-loopback eth0
> Loopback endpoints for eth0:
> id owner depth direction name enabled
> 0 mac 1 host->loop->host mac off
> 1 phy 2 host->loop->host phy-pcs on
> 2 module 3 line->loop->line cmis-far off
>
> # Enable endpoint 2
> ethtool --set-loopback eth0 id 2 on

We have to be careful about what is ABI here. Initially you plan to
implement module, so you will have:

# ethtool --show-loopback eth0
Loopback endpoints for eth0:
id owner depth direction name enabled
0 module 1 line->loop->line cmis-far off

ethtool --set-loopback eth0 id 0 on

And then somebody implements loopback at the mac:

# ethtool --show-loopback eth0
Loopback endpoints for eth0:
id owner depth direction name enabled
0 mac 1 host->loop->host mac off
1 module 2 line->loop->line cmis-far off

Your script doing

ethtool --set-loopback eth0 id 0 on

Suddenly does something else.

Is this an ABI break? How do we make this reliable so implementing
more loopbacks at different levels does not change how you use
--set-loopback?

Andrew
On Wed, 25 Feb 2026 at 05:05, Andrew Lunn <andrew@lunn.ch> wrote:
>
> > # ethtool --show-loopback eth0
> > Loopback endpoints for eth0:
> > id owner depth direction name enabled
> > 0 mac 1 host->loop->host mac off
> > 1 phy 2 host->loop->host phy-pcs on
> > 2 module 3 line->loop->line cmis-far off
> >
> > # Enable endpoint 2
> > ethtool --set-loopback eth0 id 2 on
>
> We have to be careful about what is ABI here. Initially you plan to
> implement module, so you will have:
>
> # ethtool --show-loopback eth0
> Loopback endpoints for eth0:
> id owner depth direction name enabled
> 0 module 1 line->loop->line cmis-far off
>
> ethtool --set-loopback eth0 id 0 on
>
> And then somebody implements loopback at the mac:
>
> # ethtool --show-loopback eth0
> Loopback endpoints for eth0:
> id owner depth direction name enabled
> 0 mac 1 host->loop->host mac off
> 1 module 2 line->loop->line cmis-far off
>
> You script doing
>
> ethtool --set-loopback eth0 id 0 on
>
> Suddenly does something else.

Indeed!

> Is this an ABI break? How do we make this reliable so implementing
> more loopbacks at different levels does not change how you use
> --set-loopback?

Isn't this somewhat similar to what we have with ifindex/phy_index,
but potentially unstable when modules are swapped/changed?

Instead of ids, use string name and/or topology indices (e.g.
phy_index)? All three -- owner, phy_index, name tuple?

Björn
> > Suddenly does something else.
>
> Indeed!
>
> > Is this an ABI break? How do we make this reliable so implementing
> > more loopbacks at different levels does not change how you use
> > --set-loopback?
>
> Isn't this somewhat similar to what we have with ifindex/phy_index,
> but potentially unstable when modules are swapped/changed?

If you hot plug hardware, a new PHY pops into existence, i don't think
it is too unreasonable for the hot pluggable parts to change ids. I
would however expect the fixed parts to keep their IDs.

But here we are talking about software, a kernel upgrade/downgrade
causing the IDs to change.

> Instead of ids, use string name and/or topology indices (e.g.
> phy_index)? All three -- owner, phy_index, name tuple?

Probably.

Andrew
Hello Andrew, Björn,

On 25/02/2026 14:14, Andrew Lunn wrote:
>>> Suddenly does something else.
>>
>> Indeed!
>>
>>> Is this an ABI break? How do we make this reliable so implementing
>>> more loopbacks at different levels does not change how you use
>>> --set-loopback?
>>
>> Isn't this somewhat similar to what we have with ifindex/phy_index,
>> but potentially unstable when modules are swapped/changed?
>
> If you hot plug hardware, a new PHY pops into existence, i don't think
> it is too unreasonable for the hot plugable parts to change ids. I
> would however expect the fixed parts to keep there IDs.

That's indeed the phy index behaviour.

>
> But here we are talking about software, a kernel upgrade/downgrade
> causing the IDs to change.
>
>> Instead of ids, use string name and/or topology indices (e.g.
>> phy_index)? All three -- owner, phy_index, name tuple?

The overall approach after all these discussions sounds fine to me, I do
think that the index of the component that does the loopback needs to be
there somewhere, when relevant.

Either through a name string, or a combo of an enum indicating the
component type (MAC/PHY/Module/etc.) + its index. I think it's safe to
assume that indices will fit in u32 ?

something like :

# MAC PCS loopback
ethtool --set-loopback eth0 loc mac name pcs

# PHY id 2 PMA loopback (I'm making things up here)
ethtool --set-loopback eth0 loc phy id 2 name pma

That way we can extend that fairly easily for, say, combo-port devices
where we could select which of the port we want to loopback :)

Maxime
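The difference between a positional id and a (location, index, name)
selector can be shown with a short sketch. All names here are
hypothetical and only model the idea discussed in this thread; nothing
below is an existing kernel or ethtool interface.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum lb_loc { LB_LOC_MAC, LB_LOC_PHY, LB_LOC_MODULE };

/* One loopback point as the kernel might list it for a netdev. */
struct lb_point {
	enum lb_loc loc;
	uint32_t index;		/* e.g. phy_index; 0 when not relevant */
	const char *name;	/* e.g. "pcs", "pma", "cmis-far" */
};

/*
 * Select a loopback point by (loc, index, name) instead of by its
 * position in the list. Returns the position, or -1 if not found.
 */
static int lb_find(const struct lb_point *tbl, size_t n,
		   enum lb_loc loc, uint32_t index, const char *name)
{
	for (size_t i = 0; i < n; i++) {
		if (tbl[i].loc == loc && tbl[i].index == index &&
		    strcmp(tbl[i].name, name) == 0)
			return (int)i;
	}
	return -1;
}
```

With a table that only lists the module, and a later one where a MAC
entry has been added in front of it, the same selector still resolves
to the module entry even though its position changed.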
Hey!

On Mon, 2 Mar 2026 at 10:01, Maxime Chevallier
<maxime.chevallier@bootlin.com> wrote:

> The overall approach after all these discussions sounds fine to me, I do
> think that the index of the component that does the loopback needs to be
> there somewhere, when relevant.
>
> Either through a name string, or a combo of an enum indicating the
> component type (MAC/PHY/Module/etc.) + its index. I think it's safe to
> assume that indices will fit in u32 ?
>
> something like :
>
> # MAC PCS loopback
> ethtool --set-loopback eth0 loc mac name pcs
>
> # PHY id 2 PMA loopback (I'm making things up here)
> ethtool --set-loopback eth0 loc phy id 2 name pma
>
> That way we can extend that fairly easily for, say, combo-port devices
> where we could select which of the port we want to loopback :)

Ok! I'll spin a new version with this in mind. To improve my mental
model, could you give an example how you would use a combo-port from a
userland perspective?

Cheers,
Björn
Hi Björn,

On 04/03/2026 16:52, Björn Töpel wrote:
> Hey!
>
> On Mon, 2 Mar 2026 at 10:01, Maxime Chevallier
> <maxime.chevallier@bootlin.com> wrote:
>
>> The overall approach after all these discussions sounds fine to me, I do
>> think that the index of the component that does the loopback needs to be
>> there somewhere, when relevant.
>>
>> Either through a name string, or a combo of an enum indicating the
>> component type (MAC/PHY/Module/etc.) + its index. I think it's safe to
>> assume that indices will fit in u32 ?
>>
>> something like :
>>
>> # MAC PCS loopback
>> ethtool --set-loopback eth0 loc mac name pcs
>>
>> # PHY id 2 PMA loopback (I'm making things up here)
>> ethtool --set-loopback eth0 loc phy id 2 name pma
>>
>> That way we can extend that fairly easily for, say, combo-port devices
>> where we could select which of the port we want to loopback :)
>
> Ok! I'll spin a new version with this in mind. To improve my mental
> model, could you give an example how you would use a combo-port from a
> userland perspective?

Of course :)

Considering this setup :

+-----+     +-----+
| MAC |     | PHY |----- SFP
|     |-----|     |----- RJ45
+-----+     +-----+

It's still WIP but the current state of what I have in the pipe looks
like :

# List the ports
ethtool --show-ports eth0

Port for eth10: # <- This port represents the RJ45 port of the PHY
  Port id: 1
  Supported link modes: 10baseT/Half 10baseT/Full
                        100baseT/Half 100baseT/Full
                        1000baseT/Full
                        10000baseT/Full
                        2500baseT/Full
                        5000baseT/Full
  Port type: mdi
  Active: yes
  Link: up

Port for eth1: # <- This port represents the SFP cage
  Port id: 2
  Vacant: no
  Supported MII interfaces : 10gbase-r
  Port type: sfp
  Active: no

Port for eth1: # <- This port represents the SFP module inside the cage
  Port id: 4
  Supported link modes: 10000baseCR/Full
  Port type: mdi
  Active: no
  Link: up

# Select the SFP port as the active one (note that we could either use
# port 2 or 4 here for the same result) :
ethtool --set-port eth0 id 4 active on

I may add something like :

ethtool --set-port eth0 type sfp active on
ethtool --set-port eth0 type tp active on

Maxime
> /* Loopback Direction (XXX is local/remote easier to understand?) */
> enum ethtool_loopback_dir {
> ETHTOOL_LB_DIR_NEAR_END = 0, /* Host -> Loop -> Host */
> ETHTOOL_LB_DIR_FAR_END, /* Line -> Loop -> Line */
> };
I like host->loop->host, it is much clearer than NEAR_END or
FAR_END. Where there is space, I would use this description, even if it
is a bit verbose.
> struct ethtool_loopback_layer_cfg {
> enum ethtool_loopback_layer layer; /* ETHTOOL_LB_L_MAC, etc. */
> enum ethtool_loopback_dir direction; /* NEAR or FAR */
> u32 lane_mask; /* Specific lanes */
> u32 flags; /* patterns? reserved... */
> bool enabled;
> char name[16];
What would name be used for? I don't see it in your example. The nice
thing about netlink messages is that they are extendable, unlike
system calls. If there is no current use for a field, don't add it. It
can be added later when actually needed. So i would drop flags and
name.
Does CMIS, when used with a splitter cable, allow you to set loopback
on lanes? What is your use case for lane_mask?
> };
>
> struct ethtool_loopback_cfg {
> struct ethtool_loopback_layer_cfg *entries;
> u32 num_layers;
What is num_layers used for?
> };
>
> struct ethtool_ops {
> /* ... */
>
> /* Query which layers/lane-combos are physically possible */
> int (*get_loopback_caps)(struct net_device *dev,
> struct ethtool_loopback_cfg *caps);
>
> /* Get current active status for all layers */
> int (*get_loopback_state)(struct net_device *dev,
> struct ethtool_loopback_cfg *state);
>
> /* Set one or more layer/lane configurations atomically */
> int (*set_loopback)(struct net_device *dev,
> const struct ethtool_loopback_cfg *cfg,
> struct netlink_ext_ack *extack);
> };
>
> As for layers; EXT vs PMD? EXT could be a loopback plug, whereas PMD
> would be CMIS, or whatever the driver detects.
>
> Userland would be something like:
>
> # ethtool --show-loopback eth0
> Loopback Status for eth0:
> Layer: SW | State: OFF
> Layer: MAC | State: OFF
> Layer: PMA | State: ON | Lanes: 0x1 (Lane 0) | Direction: Near-End (Local)
> Layer: PMD | State: ON | Lanes: 0xF (All) | Direction: Far-End (Remote)
ETHTOOL_LINK_MODE_800000baseKR8_Full_BIT has 8 lanes, so 0xff would be
All in this case. Lanes adds quite a bit of complexity. Do we have a
real use case for it?
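To make the 0xF versus 0xff point concrete: the "All" mask depends on how many lanes the link mode uses, so it cannot be a fixed constant. A minimal sketch, assuming a hypothetical helper named lb_all_lanes_mask() that is not part of the proposal:

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the proposal): build the "all lanes"
 * mask for a link that uses nlanes lanes.
 */
static inline uint32_t lb_all_lanes_mask(unsigned int nlanes)
{
	/* Shifting a 32-bit value by 32 or more is undefined in C. */
	if (nlanes >= 32)
		return UINT32_MAX;

	return (UINT32_C(1) << nlanes) - 1;
}
```

With this, a 4-lane link gives 0xF and an 8-lane link such as 800000baseKR8 gives 0xFF.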
> Layer: EXT | State: ON | Detected: External Loopback Plug
>
> # ethtool --set-loopback <dev> [layer[:lanes][:direction]] ... [off]
>
> # # Simple MAC loopback:
> # ethtool --set-loopback eth0 mac (Defaults: lanes=all, dir=near)
> # # Specific SerDes (PMA) lane:
> # ethtool --set-loopback eth0 pma:lane0
> # # Complex multi-layer (PMA Near + PMD Far):
> # ethtool --set-loopback eth0 pma:0x1:near pmd:all:far
Is this something we actually want? Again it adds complexity,
especially in the error handling, when pma:0x1:near works, but
pmd:all:far fails, and you need to unwind the pma:0x1:near. Is there a
use case for atomically setting two loopbacks, rather than having the
user make two different calls?
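The unwind cost can be sketched roughly as below. Everything here is hypothetical (struct lb_req, lb_apply_fn, lb_set_atomic() and the fake device are made-up names, not proposed API); it only shows the bookkeeping that an all-or-nothing set needs and that two separate calls avoid.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request: one loopback to switch on or off. */
struct lb_req {
	int layer;	/* illustrative layer id */
	bool enable;	/* requested state */
};

/* Hypothetical driver hook: 0 on success, negative on failure. */
typedef int (*lb_apply_fn)(void *priv, int layer, bool enable);

/*
 * Apply n requests as one unit. If request i fails, restore requests
 * 0..i-1 in reverse order and return the original error. prev[k] is
 * the state the layer of req[k] had before the call.
 */
static int lb_set_atomic(void *priv, lb_apply_fn apply,
			 const struct lb_req *req, const bool *prev, size_t n)
{
	size_t i;
	int err;

	for (i = 0; i < n; i++) {
		err = apply(priv, req[i].layer, req[i].enable);
		if (err)
			goto unwind;
	}
	return 0;

unwind:
	/* The unwind itself can fail; this sketch ignores that. */
	while (i-- > 0)
		(void)apply(priv, req[i].layer, prev[i]);
	return err;
}

/* Fake device for illustration: layer fail_layer always refuses. */
struct fake_dev {
	bool state[4];
	int fail_layer;
};

static int fake_apply(void *priv, int layer, bool enable)
{
	struct fake_dev *d = priv;

	if (layer == d->fail_layer)
		return -1;
	d->state[layer] = enable;
	return 0;
}
```

Note that the caller has to capture the previous state up front, and a failure during the unwind leaves the device half-configured, which is exactly the kind of error handling questioned above.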
> # # Disable all loopbacks:
> ethtool --set-loopback eth0 off
>
> Thoughts? Is this somewhat close to what you had in mind, Andrew?
I'm happy with the basic shape of this. It just needs the details
nailing down.
Andrew
Hey!
On Mon, 23 Feb 2026 at 15:30, Andrew Lunn <andrew@lunn.ch> wrote:
>
> > /* Loopback Direction (XXX is local/remote easier to understand?) */
> > enum ethtool_loopback_dir {
> > ETHTOOL_LB_DIR_NEAR_END = 0, /* Host -> Loop -> Host */
> > ETHTOOL_LB_DIR_FAR_END, /* Line -> Loop -> Line */
> > };
>
> I like host->loop->host, it is much clearer than NEAR_END or
> FAR_END. Where there is space, would use this description, even if it
> is a bit verbose.
Ok!
> > struct ethtool_loopback_layer_cfg {
> > enum ethtool_loopback_layer layer; /* ETHTOOL_LB_L_MAC, etc. */
> > enum ethtool_loopback_dir direction; /* NEAR or FAR */
> > u32 lane_mask; /* Specific lanes */
> > u32 flags; /* patterns? reserved... */
> > bool enabled;
> > char name[16];
>
> What would name be used for? I don't see it in your example. The nice
> thing about netlink messages is that they are extendable, unlike
> system calls. If there is no current use for a field, don't add it. It
> can be added later when actually needed. So I would drop flags and
> name.
Yeah, good call (I dropped flags on my local hack), name *and*
lane_mask. More on that below.
> Does CMIS, when used with a splitter cable, allow you to set loopback
> on lanes? What is your use case for lane_mask?
At least the spec exposed that it could be supported. Thinking more
about it, I think it can be added later if someone cares about it.
IOW, flags, name, and lane_mask shouldn't be part of the initial
patches.
> > };
> >
> > struct ethtool_loopback_cfg {
> > struct ethtool_loopback_layer_cfg *entries;
> > u32 num_layers;
>
> What is num_layers used for?
I'll change the _cfg to have a proper array instead. The idea is that
_cfg is a set of layers (MAC, PMA, etc.). The num_layers is the number
of entries. I'll make sure this is more obvious.
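A rough sketch of what that could look like, with flags, name and lane_mask dropped and the pointer replaced by a fixed-size array. The identifiers (lb_layer, lb_cfg, lb_cfg_find() and so on) are assumptions for illustration, with the layer list taken from the --show-loopback example; this is not the actual next revision.

```c
#include <stdbool.h>
#include <stddef.h>

/* Layers as listed in the --show-loopback example; names are assumed. */
enum lb_layer {
	LB_LAYER_SW,
	LB_LAYER_MAC,
	LB_LAYER_PMA,
	LB_LAYER_PMD,
	LB_LAYER_EXT,
	LB_LAYER_COUNT,	/* number of layers, not a layer itself */
};

enum lb_dir {
	LB_DIR_NEAR_END,	/* Host -> Loop -> Host */
	LB_DIR_FAR_END,		/* Line -> Loop -> Line */
};

/* One layer's state, without flags, name or lane_mask. */
struct lb_layer_cfg {
	enum lb_layer layer;
	enum lb_dir direction;
	bool enabled;
};

/* A set of layers: num_layers counts the valid slots in entries[]. */
struct lb_cfg {
	struct lb_layer_cfg entries[LB_LAYER_COUNT];
	unsigned int num_layers;
};

/* Return the entry for a layer, or NULL if the set does not contain it. */
static const struct lb_layer_cfg *lb_cfg_find(const struct lb_cfg *cfg,
					      enum lb_layer layer)
{
	unsigned int i;

	for (i = 0; i < cfg->num_layers; i++)
		if (cfg->entries[i].layer == layer)
			return &cfg->entries[i];
	return NULL;
}
```

Sizing the array by the number of layers means a caller never has to allocate or free entries, and num_layers has one obvious meaning.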
...
> > # ethtool --show-loopback eth0
> > Loopback Status for eth0:
> > Layer: SW | State: OFF
> > Layer: MAC | State: OFF
> > Layer: PMA | State: ON | Lanes: 0x1 (Lane 0) | Direction: Near-End (Local)
> > Layer: PMD | State: ON | Lanes: 0xF (All) | Direction: Far-End (Remote)
>
> ETHTOOL_LINK_MODE_800000baseKR8_Full_BIT has 8 lanes, so 0xff would be
> All in this case. Lanes adds quite a bit of complexity. Do we have a
> real use case for it?
I don't. As outlined above -- let's leave it out for now. If somebody
needs it in the future, we can have the discussion then.
> > Layer: EXT | State: ON | Detected: External Loopback Plug
> >
> > # ethtool --set-loopback <dev> [layer[:lanes][:direction]] ... [off]
> >
> > # # Simple MAC loopback:
> > # ethtool --set-loopback eth0 mac (Defaults: lanes=all, dir=near)
> > # # Specific SerDes (PMA) lane:
> > # ethtool --set-loopback eth0 pma:lane0
> > # # Complex multi-layer (PMA Near + PMD Far):
> > # ethtool --set-loopback eth0 pma:0x1:near pmd:all:far
>
> Is this something we actually want? Again it adds complexity,
> especially in the error handling, when pma:0x1:near works, but
> pmd:all:far fails, and you need to unwind the pma:0x1:near. Is there a
> use case for atomically setting two loopbacks, rather than having the
> user make two different calls?
Good point! Simpler is better.
> > # # Disable all loopbacks:
> > ethtool --set-loopback eth0 off
> >
> > Thoughts? Is this somewhat close to what you had in mind, Andrew?
>
> I'm happy with the basic shape of this. It just needs the details
> nailing down.
Ok! I'll roll something we can discuss more. Thanks for all input!
Björn
On Thu, 19 Feb 2026 at 14:01, Björn Töpel <bjorn@kernel.org> wrote:
>
> Hi!
>
> Background
> ==========
>
> This series adds initial ethtool support for CMIS loopback.

For the brave, there's userland support here [1], and use, e.g., as
below:

 | [root@machine ~/ethtool]# ./ethtool --show-module eth0
 | Module parameters for eth0:
 | loopback-capabilities: media-side-input, host-side-input
 | loopback-enabled: none
 | [root@machine ~/ethtool]# ./ethtool --set-module eth0 loopback-enabled host-side-input
 | netlink error: Netdevice is up, so setting loopback is not permitted
 | netlink error: Device or resource busy
 | [root@machine ~/ethtool]# ip link set dev eth0 down
 | [root@machine ~/ethtool]# ./ethtool --set-module eth0 loopback-enabled host-side-input
 | [root@machine ~/ethtool]# ./ethtool --show-module eth0
 | Module parameters for eth0:
 | loopback-capabilities: media-side-input, host-side-input
 | loopback-enabled: host-side-input
 | [root@machine ~/ethtool]# ip link set dev eth0 up

[1] https://github.com/fb-bjorn/ethtool/tree/module-loopback
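
The refusal in the transcript (an error message followed by "Device or resource busy", i.e. EBUSY) is consistent with a check along the lines below. This is a userspace sketch with made-up names (struct fake_netdev, fake_set_loopback()); the real check lives in the patches and may look different.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the device state this check needs. */
struct fake_netdev {
	bool running;			/* true while the interface is up */
	unsigned int lb_enabled;	/* currently enabled loopback modes */
};

/*
 * Refuse to change loopback while the interface is up. On refusal,
 * *msg points at a message like the one shown in the transcript.
 */
static int fake_set_loopback(struct fake_netdev *dev, unsigned int modes,
			     const char **msg)
{
	if (dev->running) {
		*msg = "Netdevice is up, so setting loopback is not permitted";
		return -EBUSY;
	}

	*msg = NULL;
	dev->lb_enabled = modes;
	return 0;
}
```

Taking the interface down first, as in the transcript, makes the same call succeed.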