drivers/net/bonding/bond_3ad.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)
The timer which ensures that no more than 3 LACPDUs are transmitted in
a second rearms itself every 333ms regardless of whether an LACPDU is
transmitted when the timer expires. This causes LACPDU tx to be delayed
until the next expiration of the timer, which effectively aligns LACPDUs
to ~333ms boundaries. This results in a variable amount of jitter in the
timing of periodic LACPDUs.
Change this to only rearm the timer when an LACPDU is actually sent,
allowing tx at any point after the timer has expired.
Signed-off-by: Seth Forshee (DigitalOcean) <sforshee@kernel.org>
---
drivers/net/bonding/bond_3ad.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
index c6807e473ab706afed9560bcdb5e6eca1934f5b7..a8d8aaa169fc09d7d5c201ff298b37b3f11a7ded 100644
--- a/drivers/net/bonding/bond_3ad.c
+++ b/drivers/net/bonding/bond_3ad.c
@@ -1378,7 +1378,7 @@ static void ad_tx_machine(struct port *port)
/* check if tx timer expired, to verify that we do not send more than
* 3 packets per second
*/
- if (port->sm_tx_timer_counter && !(--port->sm_tx_timer_counter)) {
+ if (!port->sm_tx_timer_counter || !(--port->sm_tx_timer_counter)) {
/* check if there is something to send */
if (port->ntt && (port->sm_vars & AD_PORT_LACP_ENABLED)) {
__update_lacpdu_from_port(port);
@@ -1393,12 +1393,13 @@ static void ad_tx_machine(struct port *port)
* again until demanded
*/
port->ntt = false;
+
+ /* restart tx timer(to verify that we will not
+ * exceed AD_MAX_TX_IN_SECOND
+ */
+ port->sm_tx_timer_counter = ad_ticks_per_sec / AD_MAX_TX_IN_SECOND;
}
}
- /* restart tx timer(to verify that we will not exceed
- * AD_MAX_TX_IN_SECOND
- */
- port->sm_tx_timer_counter = ad_ticks_per_sec/AD_MAX_TX_IN_SECOND;
}
}
---
base-commit: 86731a2a651e58953fc949573895f2fa6d456841
change-id: 20250625-fix-lacpdu-jitter-1554d9f600ab
Best regards,
--
Seth Forshee (DigitalOcean) <sforshee@kernel.org>
Seth Forshee (DigitalOcean) <sforshee@kernel.org> wrote: >The timer which ensures that no more than 3 LACPDUs are transmitted in >a second rearms itself every 333ms regardless of whether an LACPDU is >transmitted when the timer expires. This causes LACPDU tx to be delayed >until the next expiration of the timer, which effectively aligns LACPDUs >to ~333ms boundaries. This results in a variable amount of jitter in the >timing of periodic LACPDUs. To be clear, the "3 per second" limitation that all of this should to conform to is from IEEE 802.1AX-2014, 6.4.16 Transmit machine: "When the LACP_Enabled variable is TRUE and the NTT (6.4.7) variable is TRUE, the Transmit machine shall ensure that a properly formatted LACPDU (6.4.2) is transmitted [i.e., issue a CtrlMuxN:M_UNITDATA.Request(LACPDU) service primitive], subject to the restriction that no more than three LACPDUs may be transmitted in any Fast_Periodic_Time interval. If NTT is set to TRUE when this limit is in force, the transmission shall be delayed until such a time as the restriction is no longer in force. The NTT variable shall be set to FALSE when the Transmit machine has transmitted a LACPDU." The current implementation conforms to this as you describe: by aligning transmission to 1/3 second boundaries, no more than 3 can ever be sent in one second. If, hypothetically, the state machine were to transition, or a user updates port settings (either of which would set NTT each time) more than 3 times in a second, would your patched code obey this restriction? For completeness, and to make this email as complicated as possible, I'll note that 802.1AX-2020 removes this particular restriction in favor of incorporating the 802.3 generic limit on transmission rates for Slow Protocols (of which LACP is one) to 10 per second (802.3-2022, 30.3.1.1.38) into the state machine (802.1AX-2020, 6.4.7, see "txOpportunity" and 6.4.14 LACP Transmit machine). Linux bonding doesn't implement the 802.1AX-2020 state machines, though, so I don't think we can reasonably pick and choose arbitrary pieces from two differing editions of a standard. -J >Change this to only rearm the timer when an LACPDU is actually sent, >allowing tx at any point after the timer has expired. > >Signed-off-by: Seth Forshee (DigitalOcean) <sforshee@kernel.org> >--- > drivers/net/bonding/bond_3ad.c | 11 ++++++----- > 1 file changed, 6 insertions(+), 5 deletions(-) > >diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c >index c6807e473ab706afed9560bcdb5e6eca1934f5b7..a8d8aaa169fc09d7d5c201ff298b37b3f11a7ded 100644 >--- a/drivers/net/bonding/bond_3ad.c >+++ b/drivers/net/bonding/bond_3ad.c >@@ -1378,7 +1378,7 @@ static void ad_tx_machine(struct port *port) > /* check if tx timer expired, to verify that we do not send more than > * 3 packets per second > */ >- if (port->sm_tx_timer_counter && !(--port->sm_tx_timer_counter)) { >+ if (!port->sm_tx_timer_counter || !(--port->sm_tx_timer_counter)) { > /* check if there is something to send */ > if (port->ntt && (port->sm_vars & AD_PORT_LACP_ENABLED)) { > __update_lacpdu_from_port(port); >@@ -1393,12 +1393,13 @@ static void ad_tx_machine(struct port *port) > * again until demanded > */ > port->ntt = false; >+ >+ /* restart tx timer(to verify that we will not >+ * exceed AD_MAX_TX_IN_SECOND >+ */ >+ port->sm_tx_timer_counter = ad_ticks_per_sec / AD_MAX_TX_IN_SECOND; > } > } >- /* restart tx timer(to verify that we will not exceed >- * AD_MAX_TX_IN_SECOND >- */ >- port->sm_tx_timer_counter = ad_ticks_per_sec/AD_MAX_TX_IN_SECOND; > } > } > > >--- >base-commit: 86731a2a651e58953fc949573895f2fa6d456841 >change-id: 20250625-fix-lacpdu-jitter-1554d9f600ab > >Best regards, >-- >Seth Forshee (DigitalOcean) <sforshee@kernel.org> --- -Jay Vosburgh, jv@jvosburgh.net
On Wed, Jun 25, 2025 at 09:30:56AM -0700, Jay Vosburgh wrote: > Seth Forshee (DigitalOcean) <sforshee@kernel.org> wrote: > > >The timer which ensures that no more than 3 LACPDUs are transmitted in > >a second rearms itself every 333ms regardless of whether an LACPDU is > >transmitted when the timer expires. This causes LACPDU tx to be delayed > >until the next expiration of the timer, which effectively aligns LACPDUs > >to ~333ms boundaries. This results in a variable amount of jitter in the > >timing of periodic LACPDUs. > > To be clear, the "3 per second" limitation that all of this > should to conform to is from IEEE 802.1AX-2014, 6.4.16 Transmit machine: > > "When the LACP_Enabled variable is TRUE and the NTT (6.4.7) > variable is TRUE, the Transmit machine shall ensure that a > properly formatted LACPDU (6.4.2) is transmitted [i.e., issue a > CtrlMuxN:M_UNITDATA.Request(LACPDU) service primitive], subject > to the restriction that no more than three LACPDUs may be > transmitted in any Fast_Periodic_Time interval. If NTT is set to > TRUE when this limit is in force, the transmission shall be > delayed until such a time as the restriction is no longer in > force. The NTT variable shall be set to FALSE when the Transmit > machine has transmitted a LACPDU." > > The current implementation conforms to this as you describe: by > aligning transmission to 1/3 second boundaries, no more than 3 can ever > be sent in one second. > > If, hypothetically, the state machine were to transition, or a > user updates port settings (either of which would set NTT each time) > more than 3 times in a second, would your patched code obey this > restriction? As long as the transition doesn't reset sm_tx_timer_counter to something smaller than ad_ticks_per_sec/AD_MAX_TX_IN_SECOND, which nothing does currently (and if it did it would be at risk of sending more than 3 in a second already). The timer is reset on each tx, so no two consecutive LACPDUs can be sent less than 300ms apart, therefore no more than 3 can be per second. If a state machine transition sets NTT within 300ms of the previous tx, it will not send another until the timer expires. > For completeness, and to make this email as complicated as > possible, I'll note that 802.1AX-2020 removes this particular > restriction in favor of incorporating the 802.3 generic limit on > transmission rates for Slow Protocols (of which LACP is one) to 10 per > second (802.3-2022, 30.3.1.1.38) into the state machine (802.1AX-2020, > 6.4.7, see "txOpportunity" and 6.4.14 LACP Transmit machine). Linux > bonding doesn't implement the 802.1AX-2020 state machines, though, so I > don't think we can reasonably pick and choose arbitrary pieces from two > differing editions of a standard. > > -J > > >Change this to only rearm the timer when an LACPDU is actually sent, > >allowing tx at any point after the timer has expired. > > > >Signed-off-by: Seth Forshee (DigitalOcean) <sforshee@kernel.org> > >--- > > drivers/net/bonding/bond_3ad.c | 11 ++++++----- > > 1 file changed, 6 insertions(+), 5 deletions(-) > > > >diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c > >index c6807e473ab706afed9560bcdb5e6eca1934f5b7..a8d8aaa169fc09d7d5c201ff298b37b3f11a7ded 100644 > >--- a/drivers/net/bonding/bond_3ad.c > >+++ b/drivers/net/bonding/bond_3ad.c > >@@ -1378,7 +1378,7 @@ static void ad_tx_machine(struct port *port) > > /* check if tx timer expired, to verify that we do not send more than > > * 3 packets per second > > */ > >- if (port->sm_tx_timer_counter && !(--port->sm_tx_timer_counter)) { > >+ if (!port->sm_tx_timer_counter || !(--port->sm_tx_timer_counter)) { > > /* check if there is something to send */ > > if (port->ntt && (port->sm_vars & AD_PORT_LACP_ENABLED)) { > > __update_lacpdu_from_port(port); > >@@ -1393,12 +1393,13 @@ static void ad_tx_machine(struct port *port) > > * again until demanded > > */ > > port->ntt = false; > >+ > >+ /* restart tx timer(to verify that we will not > >+ * exceed AD_MAX_TX_IN_SECOND > >+ */ > >+ port->sm_tx_timer_counter = ad_ticks_per_sec / AD_MAX_TX_IN_SECOND; > > } > > } > >- /* restart tx timer(to verify that we will not exceed > >- * AD_MAX_TX_IN_SECOND > >- */ > >- port->sm_tx_timer_counter = ad_ticks_per_sec/AD_MAX_TX_IN_SECOND; > > } > > } > > > > > >--- > >base-commit: 86731a2a651e58953fc949573895f2fa6d456841 > >change-id: 20250625-fix-lacpdu-jitter-1554d9f600ab > > > >Best regards, > >-- > >Seth Forshee (DigitalOcean) <sforshee@kernel.org> > > --- > -Jay Vosburgh, jv@jvosburgh.net
On 6/25/25 7:00 PM, Seth Forshee (DigitalOcean) wrote: > On Wed, Jun 25, 2025 at 09:30:56AM -0700, Jay Vosburgh wrote: >> Seth Forshee (DigitalOcean) <sforshee@kernel.org> wrote: >> >>> The timer which ensures that no more than 3 LACPDUs are transmitted in >>> a second rearms itself every 333ms regardless of whether an LACPDU is >>> transmitted when the timer expires. This causes LACPDU tx to be delayed >>> until the next expiration of the timer, which effectively aligns LACPDUs >>> to ~333ms boundaries. This results in a variable amount of jitter in the >>> timing of periodic LACPDUs. >> >> To be clear, the "3 per second" limitation that all of this >> should to conform to is from IEEE 802.1AX-2014, 6.4.16 Transmit machine: >> >> "When the LACP_Enabled variable is TRUE and the NTT (6.4.7) >> variable is TRUE, the Transmit machine shall ensure that a >> properly formatted LACPDU (6.4.2) is transmitted [i.e., issue a >> CtrlMuxN:M_UNITDATA.Request(LACPDU) service primitive], subject >> to the restriction that no more than three LACPDUs may be >> transmitted in any Fast_Periodic_Time interval. If NTT is set to >> TRUE when this limit is in force, the transmission shall be >> delayed until such a time as the restriction is no longer in >> force. The NTT variable shall be set to FALSE when the Transmit >> machine has transmitted a LACPDU." >> >> The current implementation conforms to this as you describe: by >> aligning transmission to 1/3 second boundaries, no more than 3 can ever >> be sent in one second. >> >> If, hypothetically, the state machine were to transition, or a >> user updates port settings (either of which would set NTT each time) >> more than 3 times in a second, would your patched code obey this >> restriction? > > As long as the transition doesn't reset sm_tx_timer_counter to something > smaller than ad_ticks_per_sec/AD_MAX_TX_IN_SECOND, which nothing does > currently (and if it did it would be at risk of sending more than 3 in a > second already). The timer is reset on each tx, so no two consecutive > LACPDUs can be sent less than 300ms apart, therefore no more than 3 can > be per second. If a state machine transition sets NTT within 300ms of > the previous tx, it will not send another until the timer expires. @Jay, I believe the above statement is correct. What I'm missing? Side note: I'm wondering if this should be considered a fix, and thus requiring targeting the 'net' tree and a 'fixes' tag. /P
Hello, On 6/25/25 11:01, Seth Forshee (DigitalOcean) wrote: > The timer which ensures that no more than 3 LACPDUs are transmitted in > a second rearms itself every 333ms regardless of whether an LACPDU is > transmitted when the timer expires. This causes LACPDU tx to be delayed > until the next expiration of the timer, which effectively aligns LACPDUs > to ~333ms boundaries. This results in a variable amount of jitter in the > timing of periodic LACPDUs. > > Change this to only rearm the timer when an LACPDU is actually sent, > allowing tx at any point after the timer has expired. > > Signed-off-by: Seth Forshee (DigitalOcean) <sforshee@kernel.org> > --- > drivers/net/bonding/bond_3ad.c | 11 ++++++----- > 1 file changed, 6 insertions(+), 5 deletions(-) > > diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c > index c6807e473ab706afed9560bcdb5e6eca1934f5b7..a8d8aaa169fc09d7d5c201ff298b37b3f11a7ded 100644 > --- a/drivers/net/bonding/bond_3ad.c > +++ b/drivers/net/bonding/bond_3ad.c > @@ -1378,7 +1378,7 @@ static void ad_tx_machine(struct port *port) > /* check if tx timer expired, to verify that we do not send more than > * 3 packets per second > */ > - if (port->sm_tx_timer_counter && !(--port->sm_tx_timer_counter)) { > + if (!port->sm_tx_timer_counter || !(--port->sm_tx_timer_counter)) { > /* check if there is something to send */ > if (port->ntt && (port->sm_vars & AD_PORT_LACP_ENABLED)) { > __update_lacpdu_from_port(port); > @@ -1393,12 +1393,13 @@ static void ad_tx_machine(struct port *port) > * again until demanded > */ > port->ntt = false; > + > + /* restart tx timer(to verify that we will not > + * exceed AD_MAX_TX_IN_SECOND > + */ > + port->sm_tx_timer_counter = ad_ticks_per_sec / AD_MAX_TX_IN_SECOND; > } > } > - /* restart tx timer(to verify that we will not exceed > - * AD_MAX_TX_IN_SECOND > - */ > - port->sm_tx_timer_counter = ad_ticks_per_sec/AD_MAX_TX_IN_SECOND; > } > } Reviewed-by: Carlos Bilbao <carlos.bilbao@kernel.org> > > > --- > base-commit: 86731a2a651e58953fc949573895f2fa6d456841 > change-id: 20250625-fix-lacpdu-jitter-1554d9f600ab > > Best regards, Thanks, Carlos
© 2016 - 2025 Red Hat, Inc.