This test is meant to check TC bandwidth distribution, not the tx_max
of the VF. The total bandwidth check is only there to make sure that FW
tokens limit traffic, because the per-TC share is only meaningful when
the link is fully saturated.
Because the measured total is the sum of two iperf3 streams that do not
always start or stop at the same time, using a strict 1 Gbps target
caused random failures. This change adds a tolerance parameter to
BandwidthValidator, keeps per-TC checks tight at +-12%, and relaxes the
total bandwidth check to +-25% around 1 Gbps.
This avoids false failures while still confirming that the TC share
validation is meaningful.
Fixes: 23ca32e4ead4 ("selftests: drv-net: Add test for devlink-rate traffic class bandwidth distribution")
Tested-by: Carolina Jubran <cjubran@nvidia.com>
Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Nimrod Oren <noren@nvidia.com>
---
.../selftests/drivers/net/hw/devlink_rate_tc_bw.py | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/drivers/net/hw/devlink_rate_tc_bw.py b/tools/testing/selftests/drivers/net/hw/devlink_rate_tc_bw.py
index abc20bc4a34a..1713ca11f845 100755
--- a/tools/testing/selftests/drivers/net/hw/devlink_rate_tc_bw.py
+++ b/tools/testing/selftests/drivers/net/hw/devlink_rate_tc_bw.py
@@ -72,8 +72,8 @@ class BandwidthValidator:
     relative to the overall total.
     """
 
-    def __init__(self, shares):
-        self.tolerance_percent = 12
+    def __init__(self, shares, tolerance):
+        self.tolerance_percent = tolerance
         self.expected_total = sum(shares.values())
         self.bounds = {}
@@ -438,8 +438,8 @@ def main() -> None:
         raise KsftSkipEx("Could not get PCI address of the interface")
 
     cfg.require_cmd("iperf3", local=True, remote=True)
 
-    cfg.traffic_bw_validator = BandwidthValidator({"total": 1})
-    cfg.tc_bw_validator = BandwidthValidator({"tc3": 20, "tc4": 80})
+    cfg.traffic_bw_validator = BandwidthValidator({"total": 1}, 25)
+    cfg.tc_bw_validator = BandwidthValidator({"tc3": 20, "tc4": 80}, 12)
 
     cases = [test_no_tc_mapping_bandwidth, test_tc_mapping_bandwidth]
--
2.38.1
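
[Editor's note: only two hunks of BandwidthValidator are visible above, so
the bounds computation and the in_bounds() helper in this sketch are
illustrative assumptions, not the class's actual body.]

    class BandwidthValidator:
        """Check measured bandwidth shares against expected shares."""

        def __init__(self, shares, tolerance):
            # tolerance is a percentage applied symmetrically around each
            # expected share, e.g. 25 -> accept 75%..125% of the target
            self.tolerance_percent = tolerance
            self.expected_total = sum(shares.values())
            self.bounds = {}
            for name, share in shares.items():
                expected = share / self.expected_total
                delta = expected * self.tolerance_percent / 100
                self.bounds[name] = (expected - delta, expected + delta)

        def in_bounds(self, name, fraction):
            # fraction is the measured share of the total, in the range 0..1
            low, high = self.bounds[name]
            return low <= fraction <= high

With {"total": 1} and a tolerance of 25, the single "total" entry gets
bounds of 0.75..1.25 of the target, i.e. roughly 750 Mbps to 1.25 Gbps
around the 1 Gbps cap; with {"tc3": 20, "tc4": 80} and 12, tc3 must land
within 0.176..0.224 of the total.
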
On Sun, 31 Aug 2025 11:06:41 +0300 Carolina Jubran wrote:
> Because the measured total is the sum of two iperf3 streams that do not
> always start or stop at the same time

That's solvable, tho? iperf3 has --json support, it will give you
the b/w readings in the configured intervals (1sec by default).
With the interval based samples at hand you should be able to select
only the period in which b/w is stable ("middle" of the test).

While at it it may make sense to switch to lib/py/load.py wrappers
rather than threading the python locally in the test.

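[Editor's note: a minimal sketch of the suggested interval-based
measurement; stable_bps() and its trim parameter are illustrative names,
not part of the existing test or of load.py.]

    import json
    import subprocess

    def stable_bps(cmd, trim=2):
        """Run an iperf3 client and average only the stable middle samples.

        cmd must include --json; trim drops that many 1-second intervals
        from each end, where ramp-up and teardown skew the numbers.
        """
        out = subprocess.run(cmd, capture_output=True, text=True,
                             check=True).stdout
        data = json.loads(out)
        samples = [i["sum"]["bits_per_second"] for i in data["intervals"]]
        middle = samples[trim:-trim] if len(samples) > 2 * trim else samples
        return sum(middle) / len(middle)

    # e.g. stable_bps(["iperf3", "-c", "192.0.2.1", "-t", "10", "--json"])
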
On Tue, 2 Sep 2025 16:21:01 -0700 Jakub Kicinski wrote:
> On Sun, 31 Aug 2025 11:06:41 +0300 Carolina Jubran wrote:
> > Because the measured total is the sum of two iperf3 streams that do not
> > always start or stop at the same time
>
> That's solvable, tho? iperf3 has --json support, it will give you
> the b/w readings in the configured intervals (1sec by default).
> With the interval based samples at hand you should be able to select
> only the period in which b/w is stable ("middle" of the test).
>
> While at it it may make sense to switch to lib/py/load.py wrappers
> rather than threading the python locally in the test.

Hi Carolina! I think you replied to me but the reply never reached
the list, I purged it from my inbox before realizing.

I think you said that the direction of the flows is wrong for load.py.
Perhaps adding a reverse= attr which will translate the --reverse in
the client process would do?

On 06/09/2025 1:32, Jakub Kicinski wrote:
> On Tue, 2 Sep 2025 16:21:01 -0700 Jakub Kicinski wrote:
>> On Sun, 31 Aug 2025 11:06:41 +0300 Carolina Jubran wrote:
>>> Because the measured total is the sum of two iperf3 streams that do not
>>> always start or stop at the same time
>> That's solvable, tho? iperf3 has --json support, it will give you
>> the b/w readings in the configured intervals (1sec by default).
>> With the interval based samples at hand you should be able to select
>> only the period in which b/w is stable ("middle" of the test).
>>
>> While at it it may make sense to switch to lib/py/load.py wrappers
>> rather than threading the python locally in the test.
> Hi Carolina! I think you replied to me but the reply never reached
> the list, I purged it from my inbox before realizing.

:O Sorry about that

> I think you said that the direction of the flows is wrong for load.py.

Yes, that’s exactly what I said.

> Perhaps adding a reverse= attr which will translate the --reverse in
> the client process would do?

However, I’ll also need to extend load.py:

1. Binding support to ensure traffic flows through the specific VLAN
interface.
2. Interval-based measurement for iperf3 --json to analyze only the
stable period.

So my plan is:

1. Send v2 for net to fix the current test with interval-based
measurement.
2. Follow up with a patch to extend load.py with reverse/binding/interval
support and then migrate the test to use it.

Does that sound good to you?

Thanks again for the suggestion :)

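[Editor's note: for reference, the two iperf3 client knobs under
discussion. server_ip and vlan_ip are placeholder variables; the actual
load.py interface is not shown in this thread.]

    # -R / --reverse: run in reverse mode, i.e. the server transmits and
    #   the client receives.
    # -B / --bind: bind to a specific local address, so the traffic uses
    #   the intended (e.g. VLAN) interface.
    cmd = [
        "iperf3", "-c", server_ip,
        "--reverse",          # flip the flow direction
        "--bind", vlan_ip,    # pin traffic to the VLAN interface address
        "--json",             # emit per-interval samples for trimming
        "-t", "10",
    ]
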
On Mon, 8 Sep 2025 22:16:29 +0300 Carolina Jubran wrote:
> However, I’ll also need to extend load.py:
>
> 1. Binding support to ensure traffic flows through the specific VLAN
> interface.
> 2. Interval-based measurement for iperf3 --json to analyze only the
> stable period.
>
> So my plan is:
>
> 1. Send v2 for net to fix the current test with interval-based
> measurement.
> 2. Follow up with a patch to extend load.py with reverse/binding/interval
> support and then migrate the test to use it.
>
> Does that sound good to you?

Sounds too complicated, this is just a stability improvement for a test
which works on single device, and is not exercised / reported upstream.
Let's jump straight to step 2.

On 08/09/2025 23:19, Jakub Kicinski wrote:
> On Mon, 8 Sep 2025 22:16:29 +0300 Carolina Jubran wrote:
>> However, I’ll also need to extend load.py:
>>
>> 1. Binding support to ensure traffic flows through the specific VLAN
>> interface.
>> 2. Interval-based measurement for iperf3 --json to analyze only the
>> stable period.
>>
>> So my plan is:
>>
>> 1. Send v2 for net to fix the current test with interval-based
>> measurement.
>> 2. Follow up with a patch to extend load.py with reverse/binding/interval
>> support and then migrate the test to use it.
>>
>> Does that sound good to you?
> Sounds too complicated, this is just a stability improvement for a test
> which works on single device, and is not exercised / reported upstream.
> Let's jump straight to step 2.

Ack, I’ll drop this patch from the series and handle the migration to
load.py and reliability improvements in a follow-up.

On Tue, 9 Sep 2025 13:06:22 +0300 Carolina Jubran wrote:
> > Sounds too complicated, this is just a stability improvement for a test
> > which works on single device, and is not exercised / reported upstream.
> > Let's jump straight to step 2.
>
> Ack, I’ll drop this patch from the series and handle the migration to
> load.py and reliability improvements in a follow-up.

I think you misunderstood me, let's redo the whole series against
net-next. No need for Fixes tags.

On Sun, Aug 31, 2025 at 11:06:41AM +0300, Carolina Jubran wrote:
> This test is meant to check TC bandwidth distribution, not the tx_max
> of the VF. The total bandwidth check is only there to make sure that FW
> tokens limit traffic, because the per-TC share is only meaningful when
> the link is fully saturated.
>
> Because the measured total is the sum of two iperf3 streams that do not
> always start or stop at the same time, using a strict 1 Gbps target
> caused random failures. This change adds a tolerance parameter to
> BandwidthValidator, keeps per-TC checks tight at +-12%, and relaxes the
> total bandwidth check to +-25% around 1 Gbps.
>
> This avoids false failures while still confirming that the TC share
> validation is meaningful.
>
> Fixes: 23ca32e4ead4 ("selftests: drv-net: Add test for devlink-rate traffic class bandwidth distribution")
> Tested-by: Carolina Jubran <cjubran@nvidia.com>
> Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
> Reviewed-by: Nimrod Oren <noren@nvidia.com>

Reviewed-by: Simon Horman <horms@kernel.org>