This test is meant to check TC bandwidth distribution, not the tx_max
of the VF. The total bandwidth check is only there to make sure that FW
tokens limit traffic, because the per-TC share is only meaningful when
the link is fully saturated.
Because the measured total is the sum of two iperf3 streams that do not
always start or stop at the same time, using a strict 1 Gbps target
caused random failures. This change adds a tolerance parameter to
BandwidthValidator, keeps per-TC checks tight at +-12%, and relaxes the
total bandwidth check to +-25% around 1 Gbps.
This avoids false failures while still confirming that the TC share
validation is meaningful.
Fixes: 23ca32e4ead4 ("selftests: drv-net: Add test for devlink-rate traffic class bandwidth distribution")
Tested-by: Carolina Jubran <cjubran@nvidia.com>
Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Nimrod Oren <noren@nvidia.com>
---
.../selftests/drivers/net/hw/devlink_rate_tc_bw.py | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/drivers/net/hw/devlink_rate_tc_bw.py b/tools/testing/selftests/drivers/net/hw/devlink_rate_tc_bw.py
index abc20bc4a34a..1713ca11f845 100755
--- a/tools/testing/selftests/drivers/net/hw/devlink_rate_tc_bw.py
+++ b/tools/testing/selftests/drivers/net/hw/devlink_rate_tc_bw.py
@@ -72,8 +72,8 @@ class BandwidthValidator:
     relative to the overall total.
     """
 
-    def __init__(self, shares):
-        self.tolerance_percent = 12
+    def __init__(self, shares, tolerance):
+        self.tolerance_percent = tolerance
         self.expected_total = sum(shares.values())
         self.bounds = {}
@@ -438,8 +438,8 @@ def main() -> None:
         raise KsftSkipEx("Could not get PCI address of the interface")
 
     cfg.require_cmd("iperf3", local=True, remote=True)
 
-    cfg.traffic_bw_validator = BandwidthValidator({"total": 1})
-    cfg.tc_bw_validator = BandwidthValidator({"tc3": 20, "tc4": 80})
+    cfg.traffic_bw_validator = BandwidthValidator({"total": 1}, 25)
+    cfg.tc_bw_validator = BandwidthValidator({"tc3": 20, "tc4": 80}, 12)
 
     cases = [test_no_tc_mapping_bandwidth, test_tc_mapping_bandwidth]
--
2.38.1
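
[Editor's note: only two hunks of BandwidthValidator are visible above, so
the bounds computation and the in_bounds() helper in this sketch are
illustrative assumptions, not the class's actual body.]

    class BandwidthValidator:
        """Check measured bandwidth shares against expected shares."""

        def __init__(self, shares, tolerance):
            # tolerance is a percentage applied symmetrically around each
            # expected share, e.g. 25 -> accept 75%..125% of the target
            self.tolerance_percent = tolerance
            self.expected_total = sum(shares.values())
            self.bounds = {}
            for name, share in shares.items():
                expected = share / self.expected_total
                delta = expected * self.tolerance_percent / 100
                self.bounds[name] = (expected - delta, expected + delta)

        def in_bounds(self, name, fraction):
            # fraction is the measured share of the total, in the range 0..1
            low, high = self.bounds[name]
            return low <= fraction <= high

With {"total": 1} and a tolerance of 25, the single "total" entry gets
bounds of 0.75..1.25 of the target, i.e. roughly 750 Mbps to 1.25 Gbps
around the 1 Gbps cap; with {"tc3": 20, "tc4": 80} and 12, tc3 must land
within 0.176..0.224 of the total.
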
On Sun, 31 Aug 2025 11:06:41 +0300 Carolina Jubran wrote:
> Because the measured total is the sum of two iperf3 streams that do not
> always start or stop at the same time

That's solvable, tho? iperf3 has --json support, it will give you
the b/w readings in the configured intervals (1sec by default).
With the interval based samples at hand you should be able to select
only the period in which b/w is stable ("middle" of the test).

While at it it may make sense to switch to lib/py/load.py wrappers
rather than threading the python locally in the test.

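[Editor's note: a minimal sketch of the suggested interval-based
measurement; stable_bps() and its trim parameter are illustrative names,
not part of the existing test or of load.py.]

    import json
    import subprocess

    def stable_bps(cmd, trim=2):
        """Run an iperf3 client and average only the stable middle samples.

        cmd must include --json; trim drops that many 1-second intervals
        from each end, where ramp-up and teardown skew the numbers.
        """
        out = subprocess.run(cmd, capture_output=True, text=True,
                             check=True).stdout
        data = json.loads(out)
        samples = [i["sum"]["bits_per_second"] for i in data["intervals"]]
        middle = samples[trim:-trim] if len(samples) > 2 * trim else samples
        return sum(middle) / len(middle)

    # e.g. stable_bps(["iperf3", "-c", "192.0.2.1", "-t", "10", "--json"])
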
On Tue, 2 Sep 2025 16:21:01 -0700 Jakub Kicinski wrote:
> On Sun, 31 Aug 2025 11:06:41 +0300 Carolina Jubran wrote:
> > Because the measured total is the sum of two iperf3 streams that do not
> > always start or stop at the same time
>
> That's solvable, tho? iperf3 has --json support, it will give you
> the b/w readings in the configured intervals (1sec by default).
> With the interval based samples at hand you should be able to select
> only the period in which b/w is stable ("middle" of the test).
>
> While at it it may make sense to switch to lib/py/load.py wrappers
> rather than threading the python locally in the test.

Hi Carolina! I think you replied to me but the reply never reached
the list, I purged it from my inbox before realizing.

I think you said that the direction of the flows is wrong for load.py.
Perhaps adding a reverse= attr which will translate the --reverse in
the client process would do?

On 06/09/2025 1:32, Jakub Kicinski wrote:
> On Tue, 2 Sep 2025 16:21:01 -0700 Jakub Kicinski wrote:
>> On Sun, 31 Aug 2025 11:06:41 +0300 Carolina Jubran wrote:
>>> Because the measured total is the sum of two iperf3 streams that do not
>>> always start or stop at the same time
>> That's solvable, tho? iperf3 has --json support, it will give you
>> the b/w readings in the configured intervals (1sec by default).
>> With the interval based samples at hand you should be able to select
>> only the period in which b/w is stable ("middle" of the test).
>>
>> While at it it may make sense to switch to lib/py/load.py wrappers
>> rather than threading the python locally in the test.
> Hi Carolina! I think you replied to me but the reply never reached
> the list, I purged it from my inbox before realizing.

:O Sorry about that

> I think you said that the direction of the flows is wrong for load.py.

Yes, that’s exactly what I said.

> Perhaps adding a reverse= attr which will translate the --reverse in
> the client process would do?

However, I’ll also need to extend load.py:

1. Binding support to ensure traffic flows through the specific VLAN
interface.
2. Interval-based measurement for iperf3 --json to analyze only the
stable period.

So my plan is:

1. Send v2 for net to fix the current test with interval-based
measurement.
2. Follow up with a patch to extend load.py with reverse/binding/interval
support and then migrate the test to use it.

Does that sound good to you?

Thanks again for the suggestion :)

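[Editor's note: for reference, the two iperf3 client knobs under
discussion. server_ip and vlan_ip are placeholder variables; the actual
load.py interface is not shown in this thread.]

    # -R / --reverse: run in reverse mode, i.e. the server transmits and
    #   the client receives.
    # -B / --bind: bind to a specific local address, so the traffic uses
    #   the intended (e.g. VLAN) interface.
    cmd = [
        "iperf3", "-c", server_ip,
        "--reverse",          # flip the flow direction
        "--bind", vlan_ip,    # pin traffic to the VLAN interface address
        "--json",             # emit per-interval samples for trimming
        "-t", "10",
    ]
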
On Mon, 8 Sep 2025 22:16:29 +0300 Carolina Jubran wrote:
> However, I’ll also need to extend load.py:
>
> 1. Binding support to ensure traffic flows through the specific VLAN
> interface.
> 2. Interval-based measurement for iperf3 --json to analyze only the
> stable period.
>
> So my plan is:
>
> 1. Send v2 for net to fix the current test with interval-based
> measurement.
> 2. Follow up with a patch to extend load.py with reverse/binding/interval
> support and then migrate the test to use it.
>
> Does that sound good to you?

Sounds too complicated, this is just a stability improvement for a test
which works on single device, and is not exercised / reported upstream.
Let's jump straight to step 2.

On 08/09/2025 23:19, Jakub Kicinski wrote:
> On Mon, 8 Sep 2025 22:16:29 +0300 Carolina Jubran wrote:
>> However, I’ll also need to extend load.py:
>>
>> 1. Binding support to ensure traffic flows through the specific VLAN
>> interface.
>> 2. Interval-based measurement for iperf3 --json to analyze only the
>> stable period.
>>
>> So my plan is:
>>
>> 1. Send v2 for net to fix the current test with interval-based
>> measurement.
>> 2. Follow up with a patch to extend load.py with reverse/binding/interval
>> support and then migrate the test to use it.
>>
>> Does that sound good to you?
> Sounds too complicated, this is just a stability improvement for a test
> which works on single device, and is not exercised / reported upstream.
> Let's jump straight to step 2.

Ack, I’ll drop this patch from the series and handle the migration to
load.py and reliability improvements in a follow-up.

On Tue, 9 Sep 2025 13:06:22 +0300 Carolina Jubran wrote:
> > Sounds too complicated, this is just a stability improvement for a test
> > which works on single device, and is not exercised / reported upstream.
> > Let's jump straight to step 2.
>
> Ack, I’ll drop this patch from the series and handle the migration to
> load.py and reliability improvements in a follow-up.

I think you misunderstood me, let's redo the whole series against
net-next. No need for Fixes tags.

On Sun, Aug 31, 2025 at 11:06:41AM +0300, Carolina Jubran wrote:
> This test is meant to check TC bandwidth distribution, not the tx_max
> of the VF. The total bandwidth check is only there to make sure that FW
> tokens limit traffic, because the per-TC share is only meaningful when
> the link is fully saturated.
>
> Because the measured total is the sum of two iperf3 streams that do not
> always start or stop at the same time, using a strict 1 Gbps target
> caused random failures. This change adds a tolerance parameter to
> BandwidthValidator, keeps per-TC checks tight at +-12%, and relaxes the
> total bandwidth check to +-25% around 1 Gbps.
>
> This avoids false failures while still confirming that the TC share
> validation is meaningful.
>
> Fixes: 23ca32e4ead4 ("selftests: drv-net: Add test for devlink-rate traffic class bandwidth distribution")
> Tested-by: Carolina Jubran <cjubran@nvidia.com>
> Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
> Reviewed-by: Nimrod Oren <noren@nvidia.com>

Reviewed-by: Simon Horman <horms@kernel.org>