[PATCH mptcp-next v2 1/4] selftests: mptcp: lib: support flaky subtests

Matthieu Baerts (NGI0) posted 4 patches 5 months ago
There is a newer version of this series
[PATCH mptcp-next v2 1/4] selftests: mptcp: lib: support flaky subtests
Posted by Matthieu Baerts (NGI0) 5 months ago
Some subtests can be unstable, failing once every X runs. Fixing them
can take time: there could be an issue in the kernel or in the subtest,
and it is then important to do a proper analysis, not to hide real bugs.

To avoid creating noises on the different CIs, it is important to have a
simple way to mark subtests as flaky, and ignore the errors. This is
what this patch introduces: subtests can be marked as flaky by setting
MPTCP_LIB_SUBTEST_FLAKY env var to 1, e.g.

  MPTCP_LIB_SUBTEST_FLAKY=1 <run flaky subtest>

The subtest will be executed, and errors (if any) will be ignored. It is
still good to run these subtests, as it exercises code, and the results
can still be useful for the on-going investigations.

Note that the MPTCP CI will continue to track these flaky subtests, and
a ticket should be created before marking a subtest as flaky.

Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
 tools/testing/selftests/net/mptcp/mptcp_lib.sh | 38 ++++++++++++++++++++------
 1 file changed, 29 insertions(+), 9 deletions(-)

diff --git a/tools/testing/selftests/net/mptcp/mptcp_lib.sh b/tools/testing/selftests/net/mptcp/mptcp_lib.sh
index ad2ebda5cb64..44febbc11131 100644
--- a/tools/testing/selftests/net/mptcp/mptcp_lib.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_lib.sh
@@ -21,6 +21,7 @@ declare -rx MPTCP_LIB_AF_INET6=10
 
 MPTCP_LIB_SUBTESTS=()
 MPTCP_LIB_SUBTESTS_DUPLICATED=0
+MPTCP_LIB_SUBTEST_FLAKY=0
 MPTCP_LIB_TEST_COUNTER=0
 MPTCP_LIB_TEST_FORMAT="%02u %-50s"
 MPTCP_LIB_IP_MPTCP=0
@@ -41,6 +42,18 @@ else
 	readonly MPTCP_LIB_COLOR_RESET=
 fi
 
+# SELFTESTS_MPTCP_LIB_EXPECT_ALL_FEATURES env var can be set when validating all
+# features using the last version of the kernel and the selftests to make sure
+# a test is not being skipped by mistake.
+mptcp_lib_expect_all_features() {
+	[ "${SELFTESTS_MPTCP_LIB_EXPECT_ALL_FEATURES:-}" = "1" ]
+}
+
+mptcp_lib_subtest_is_flaky() {
+	# If all features are expected, do not threat them as flaky
+	[ "${MPTCP_LIB_SUBTEST_FLAKY}" = 1 ] && ! mptcp_lib_expect_all_features
+}
+
 # $1: color, $2: text
 mptcp_lib_print_color() {
 	echo -e "${MPTCP_LIB_START_PRINT:-}${*}${MPTCP_LIB_COLOR_RESET}"
@@ -72,20 +85,21 @@ mptcp_lib_pr_skip() {
 }
 
 mptcp_lib_pr_fail() {
-	mptcp_lib_print_err "[FAIL]${1:+ ${*}}"
+	local title
+
+	if mptcp_lib_subtest_is_flaky; then
+		title="IGNO"
+	else
+		title="FAIL"
+	fi
+
+	mptcp_lib_print_err "[${title}]${1:+ ${*}}"
 }
 
 mptcp_lib_pr_info() {
 	mptcp_lib_print_info "INFO: ${*}"
 }
 
-# SELFTESTS_MPTCP_LIB_EXPECT_ALL_FEATURES env var can be set when validating all
-# features using the last version of the kernel and the selftests to make sure
-# a test is not being skipped by mistake.
-mptcp_lib_expect_all_features() {
-	[ "${SELFTESTS_MPTCP_LIB_EXPECT_ALL_FEATURES:-}" = "1" ]
-}
-
 # $1: msg
 mptcp_lib_fail_if_expected_feature() {
 	if mptcp_lib_expect_all_features; then
@@ -208,7 +222,13 @@ mptcp_lib_result_pass() {
 
 # $1: test name
 mptcp_lib_result_fail() {
-	__mptcp_lib_result_add "not ok" "${1}"
+	if mptcp_lib_subtest_is_flaky; then
+		# It might sound better to use 'not ok # TODO' or 'ok # SKIP',
+		# but some CIs don't understand 'TODO' and treat SKIP as errors.
+		__mptcp_lib_result_add "ok" "${1} # IGNORE"
+	else
+		__mptcp_lib_result_add "not ok" "${1}"
+	fi
 }
 
 # $1: test name

-- 
2.43.0
Re: [PATCH mptcp-next v2 1/4] selftests: mptcp: lib: support flaky subtests
Posted by Mat Martineau 4 months, 4 weeks ago
On Tue, 21 May 2024, Matthieu Baerts (NGI0) wrote:

> Some subtests can be unstable, failing once every X runs. Fixing them
> can take time: there could be an issue in the kernel or in the subtest,
> and it is then important to do a proper analysis, not to hide real bugs.
>
> To avoid creating noises on the different CIs, it is important to have a
> simple way to mark subtests as flaky, and ignore the errors. This is
> what this patch introduces: subtests can be marked as flaky by setting
> MPTCP_LIB_SUBTEST_FLAKY env var to 1, e.g.
>
>  MPTCP_LIB_SUBTEST_FLAKY=1 <run flaky subtest>
>
> The subtest will be executed, and errors (if any) will be ignored. It is
> still good to run these subtests, as it exercises code, and the results
> can still be useful for the on-going investigations.
>
> Note that the MPTCP CI will continue to track these flaky subtests, and
> a ticket should be created before marking a subtest as flaky.
>
> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
> ---
> tools/testing/selftests/net/mptcp/mptcp_lib.sh | 38 ++++++++++++++++++++------
> 1 file changed, 29 insertions(+), 9 deletions(-)
>
> diff --git a/tools/testing/selftests/net/mptcp/mptcp_lib.sh b/tools/testing/selftests/net/mptcp/mptcp_lib.sh
> index ad2ebda5cb64..44febbc11131 100644
> --- a/tools/testing/selftests/net/mptcp/mptcp_lib.sh
> +++ b/tools/testing/selftests/net/mptcp/mptcp_lib.sh
> @@ -21,6 +21,7 @@ declare -rx MPTCP_LIB_AF_INET6=10
>
> MPTCP_LIB_SUBTESTS=()
> MPTCP_LIB_SUBTESTS_DUPLICATED=0
> +MPTCP_LIB_SUBTEST_FLAKY=0
> MPTCP_LIB_TEST_COUNTER=0
> MPTCP_LIB_TEST_FORMAT="%02u %-50s"
> MPTCP_LIB_IP_MPTCP=0
> @@ -41,6 +42,18 @@ else
> 	readonly MPTCP_LIB_COLOR_RESET=
> fi
>
> +# SELFTESTS_MPTCP_LIB_EXPECT_ALL_FEATURES env var can be set when validating all
> +# features using the last version of the kernel and the selftests to make sure
> +# a test is not being skipped by mistake.
> +mptcp_lib_expect_all_features() {
> +	[ "${SELFTESTS_MPTCP_LIB_EXPECT_ALL_FEATURES:-}" = "1" ]
> +}

Hi Matthieu -

This is overloading SELFTESTS_MPTCP_LIB_EXPECT_ALL_FEATURES to also 
override any "flaky" flags, and treat those tests normally, correct?

I think it's preferable to add an environment varaible specific to this 
(like SELFTESTS_MPTCP_LIB_OVERRIDE_FLAKY), what do you think?

- Mat

> +
> +mptcp_lib_subtest_is_flaky() {
> +	# If all features are expected, do not threat them as flaky
> +	[ "${MPTCP_LIB_SUBTEST_FLAKY}" = 1 ] && ! mptcp_lib_expect_all_features
> +}
> +
> # $1: color, $2: text
> mptcp_lib_print_color() {
> 	echo -e "${MPTCP_LIB_START_PRINT:-}${*}${MPTCP_LIB_COLOR_RESET}"
> @@ -72,20 +85,21 @@ mptcp_lib_pr_skip() {
> }
>
> mptcp_lib_pr_fail() {
> -	mptcp_lib_print_err "[FAIL]${1:+ ${*}}"
> +	local title
> +
> +	if mptcp_lib_subtest_is_flaky; then
> +		title="IGNO"
> +	else
> +		title="FAIL"
> +	fi
> +
> +	mptcp_lib_print_err "[${title}]${1:+ ${*}}"
> }
>
> mptcp_lib_pr_info() {
> 	mptcp_lib_print_info "INFO: ${*}"
> }
>
> -# SELFTESTS_MPTCP_LIB_EXPECT_ALL_FEATURES env var can be set when validating all
> -# features using the last version of the kernel and the selftests to make sure
> -# a test is not being skipped by mistake.
> -mptcp_lib_expect_all_features() {
> -	[ "${SELFTESTS_MPTCP_LIB_EXPECT_ALL_FEATURES:-}" = "1" ]
> -}
> -
> # $1: msg
> mptcp_lib_fail_if_expected_feature() {
> 	if mptcp_lib_expect_all_features; then
> @@ -208,7 +222,13 @@ mptcp_lib_result_pass() {
>
> # $1: test name
> mptcp_lib_result_fail() {
> -	__mptcp_lib_result_add "not ok" "${1}"
> +	if mptcp_lib_subtest_is_flaky; then
> +		# It might sound better to use 'not ok # TODO' or 'ok # SKIP',
> +		# but some CIs don't understand 'TODO' and treat SKIP as errors.
> +		__mptcp_lib_result_add "ok" "${1} # IGNORE"
> +	else
> +		__mptcp_lib_result_add "not ok" "${1}"
> +	fi
> }
>
> # $1: test name
>
> -- 
> 2.43.0
>
>
>
Re: [PATCH mptcp-next v2 1/4] selftests: mptcp: lib: support flaky subtests
Posted by Matthieu Baerts 4 months, 4 weeks ago
Hi Mat,

Thank you for the review!

On 22/05/2024 02:27, Mat Martineau wrote:
> On Tue, 21 May 2024, Matthieu Baerts (NGI0) wrote:
> 
>> Some subtests can be unstable, failing once every X runs. Fixing them
>> can take time: there could be an issue in the kernel or in the subtest,
>> and it is then important to do a proper analysis, not to hide real bugs.
>>
>> To avoid creating noises on the different CIs, it is important to have a
>> simple way to mark subtests as flaky, and ignore the errors. This is
>> what this patch introduces: subtests can be marked as flaky by setting
>> MPTCP_LIB_SUBTEST_FLAKY env var to 1, e.g.
>>
>>  MPTCP_LIB_SUBTEST_FLAKY=1 <run flaky subtest>
>>
>> The subtest will be executed, and errors (if any) will be ignored. It is
>> still good to run these subtests, as it exercises code, and the results
>> can still be useful for the on-going investigations.
>>
>> Note that the MPTCP CI will continue to track these flaky subtests, and
>> a ticket should be created before marking a subtest as flaky.
>>
>> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
>> ---
>> tools/testing/selftests/net/mptcp/mptcp_lib.sh | 38 ++++++++++++++++++
>> ++------
>> 1 file changed, 29 insertions(+), 9 deletions(-)
>>
>> diff --git a/tools/testing/selftests/net/mptcp/mptcp_lib.sh b/tools/
>> testing/selftests/net/mptcp/mptcp_lib.sh
>> index ad2ebda5cb64..44febbc11131 100644
>> --- a/tools/testing/selftests/net/mptcp/mptcp_lib.sh
>> +++ b/tools/testing/selftests/net/mptcp/mptcp_lib.sh
>> @@ -21,6 +21,7 @@ declare -rx MPTCP_LIB_AF_INET6=10
>>
>> MPTCP_LIB_SUBTESTS=()
>> MPTCP_LIB_SUBTESTS_DUPLICATED=0
>> +MPTCP_LIB_SUBTEST_FLAKY=0
>> MPTCP_LIB_TEST_COUNTER=0
>> MPTCP_LIB_TEST_FORMAT="%02u %-50s"
>> MPTCP_LIB_IP_MPTCP=0
>> @@ -41,6 +42,18 @@ else
>>     readonly MPTCP_LIB_COLOR_RESET=
>> fi
>>
>> +# SELFTESTS_MPTCP_LIB_EXPECT_ALL_FEATURES env var can be set when
>> validating all
>> +# features using the last version of the kernel and the selftests to
>> make sure
>> +# a test is not being skipped by mistake.
>> +mptcp_lib_expect_all_features() {
>> +    [ "${SELFTESTS_MPTCP_LIB_EXPECT_ALL_FEATURES:-}" = "1" ]
>> +}
> 
> Hi Matthieu -
> 
> This is overloading SELFTESTS_MPTCP_LIB_EXPECT_ALL_FEATURES to also
> override any "flaky" flags, and treat those tests normally, correct?

Correct.

> I think it's preferable to add an environment varaible specific to this
> (like SELFTESTS_MPTCP_LIB_OVERRIDE_FLAKY), what do you think?

Sure: I initially started to do that, then I thought "expecting all
features" also means "expecting all tests to work" somehow, but I agree
it is not clear. I will add a new dedicated env var in the v3 (and
target mptcp-net), and set it on the CI side as well.

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.