selftests/run_kselftest.sh: Add --error-on-fail flag

[PATCH] selftests/run_kselftest.sh: Add --error-on-fail flag

Posted by Brendan Jackman 4 months ago

Parsing KTAP is quite an inconvenience, but most of the time the thing
you really want to know is "did anything fail"?

Let's give the user the ability to get this information without needing
to parse anything.

Because of the use of subshells and namespaces, this needs to be
communicated via a file. Just write arbitrary data into the file and
treat non-emppty content as a signal that something failed.

Signed-off-by: Brendan Jackman <jackmanb@google.com>
---
 tools/testing/selftests/kselftest/runner.sh | 14 ++++++++++----
 tools/testing/selftests/run_kselftest.sh    | 14 ++++++++++++++
 2 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/kselftest/runner.sh b/tools/testing/selftests/kselftest/runner.sh
index 2c3c58e65a419f5ee8d7dc51a37671237a07fa0b..fd1e0f9b1cef48c5df1afaaedc0c97bee1c12dc5 100644
--- a/tools/testing/selftests/kselftest/runner.sh
+++ b/tools/testing/selftests/kselftest/runner.sh
@@ -44,6 +44,12 @@ tap_timeout()
 	fi
 }
 
+report_failure()
+{
+	echo "not ok $*" >> "$kselftest_failures_file"
+	echo "$*" >> "$kselftest_failures_file"
+}
+
 run_one()
 {
 	DIR="$1"
@@ -105,7 +111,7 @@ run_one()
 	echo "# $TEST_HDR_MSG"
 	if [ ! -e "$TEST" ]; then
 		echo "# Warning: file $TEST is missing!"
-		echo "not ok $test_num $TEST_HDR_MSG"
+		report_failure "$test_num $TEST_HDR_MSG"
 	else
 		if [ -x /usr/bin/stdbuf ]; then
 			stdbuf="/usr/bin/stdbuf --output=L "
@@ -123,7 +129,7 @@ run_one()
 				interpreter=$(head -n 1 "$TEST" | cut -c 3-)
 				cmd="$stdbuf $interpreter ./$BASENAME_TEST"
 			else
-				echo "not ok $test_num $TEST_HDR_MSG"
+				report_failure "$test_num $TEST_HDR_MSG"
 				return
 			fi
 		fi
@@ -137,9 +143,9 @@ run_one()
 			echo "ok $test_num $TEST_HDR_MSG # SKIP"
 		elif [ $rc -eq $timeout_rc ]; then \
 			echo "#"
-			echo "not ok $test_num $TEST_HDR_MSG # TIMEOUT $kselftest_timeout seconds"
+			report_failure "$test_num $TEST_HDR_MSG # TIMEOUT $kselftest_timeout seconds"
 		else
-			echo "not ok $test_num $TEST_HDR_MSG # exit=$rc"
+			report_failure "$test_num $TEST_HDR_MSG # exit=$rc"
 		fi)
 		cd - >/dev/null
 	fi
diff --git a/tools/testing/selftests/run_kselftest.sh b/tools/testing/selftests/run_kselftest.sh
index 0443beacf3621ae36cb12ffd57f696ddef3526b5..c345f38ad424029bfe50d19b26bdd1d4bda29316 100755
--- a/tools/testing/selftests/run_kselftest.sh
+++ b/tools/testing/selftests/run_kselftest.sh
@@ -36,6 +36,7 @@ Usage: $0 [OPTIONS]
   -n | --netns			Run each test in namespace
   -h | --help			Show this usage info
   -o | --override-timeout	Number of seconds after which we timeout
+  -e | --error-on-fail	After finishing all tests, exit with code 1 if any failed.
 EOF
 	exit $1
 }
@@ -44,6 +45,7 @@ COLLECTIONS=""
 TESTS=""
 dryrun=""
 kselftest_override_timeout=""
+ERROR_ON_FAIL=false
 while true; do
 	case "$1" in
 		-s | --summary)
@@ -71,6 +73,9 @@ while true; do
 		-o | --override-timeout)
 			kselftest_override_timeout="$2"
 			shift 2 ;;
+		-e | --error-on-fail)
+			ERROR_ON_FAIL="true"
+			shift ;;
 		-h | --help)
 			usage 0 ;;
 		"")
@@ -105,9 +110,18 @@ if [ -n "$TESTS" ]; then
 	available="$(echo "$valid" | sed -e 's/ /\n/g')"
 fi
 
+kselftest_failures_file=$(mktemp --tmpdir kselftest-failures-XXXXXX)
+export kselftest_failures_file
+
 collections=$(echo "$available" | cut -d: -f1 | sort | uniq)
 for collection in $collections ; do
 	[ -w /dev/kmsg ] && echo "kselftest: Running tests in $collection" >> /dev/kmsg
 	tests=$(echo "$available" | grep "^$collection:" | cut -d: -f2)
 	($dryrun cd "$collection" && $dryrun run_many $tests)
 done
+
+failures="$(cat "$kselftest_failures_file")"
+rm "$kselftest_failures_file"
+if "$ERROR_ON_FAIL" && [ "$failures" ]; then
+	exit 1
+fi

---
base-commit: 8f5ae30d69d7543eee0d70083daf4de8fe15d585
change-id: 20251007-b4-ksft-error-on-fail-0c2cb3246041

Best regards,
-- 
Brendan Jackman <jackmanb@google.com>

Re: [PATCH] selftests/run_kselftest.sh: Add --error-on-fail flag

Posted by Thomas Weißschuh 4 months ago

On Tue, Oct 07, 2025 at 11:06:46AM +0000, Brendan Jackman wrote:
> Parsing KTAP is quite an inconvenience, but most of the time the thing
> you really want to know is "did anything fail"?
> 
> Let's give the user the ability to get this information without needing
> to parse anything.
> 
> Because of the use of subshells and namespaces, this needs to be
> communicated via a file. Just write arbitrary data into the file and
> treat non-emppty content as a signal that something failed.
> 
> Signed-off-by: Brendan Jackman <jackmanb@google.com>
> ---
>  tools/testing/selftests/kselftest/runner.sh | 14 ++++++++++----
>  tools/testing/selftests/run_kselftest.sh    | 14 ++++++++++++++
>  2 files changed, 24 insertions(+), 4 deletions(-)
> 
> diff --git a/tools/testing/selftests/kselftest/runner.sh b/tools/testing/selftests/kselftest/runner.sh
> index 2c3c58e65a419f5ee8d7dc51a37671237a07fa0b..fd1e0f9b1cef48c5df1afaaedc0c97bee1c12dc5 100644
> --- a/tools/testing/selftests/kselftest/runner.sh
> +++ b/tools/testing/selftests/kselftest/runner.sh
> @@ -44,6 +44,12 @@ tap_timeout()
>  	fi
>  }
>  
> +report_failure()
> +{
> +	echo "not ok $*" >> "$kselftest_failures_file"
> +	echo "$*" >> "$kselftest_failures_file"

Both of these go into the failures file. The first should go to stdout, no?

> +}
> +
>  run_one()
>  {
>  	DIR="$1"
> @@ -105,7 +111,7 @@ run_one()
>  	echo "# $TEST_HDR_MSG"
>  	if [ ! -e "$TEST" ]; then
>  		echo "# Warning: file $TEST is missing!"
> -		echo "not ok $test_num $TEST_HDR_MSG"
> +		report_failure "$test_num $TEST_HDR_MSG"
>  	else
>  		if [ -x /usr/bin/stdbuf ]; then
>  			stdbuf="/usr/bin/stdbuf --output=L "
> @@ -123,7 +129,7 @@ run_one()
>  				interpreter=$(head -n 1 "$TEST" | cut -c 3-)
>  				cmd="$stdbuf $interpreter ./$BASENAME_TEST"
>  			else
> -				echo "not ok $test_num $TEST_HDR_MSG"
> +				report_failure "$test_num $TEST_HDR_MSG"
>  				return
>  			fi
>  		fi
> @@ -137,9 +143,9 @@ run_one()
>  			echo "ok $test_num $TEST_HDR_MSG # SKIP"
>  		elif [ $rc -eq $timeout_rc ]; then \
>  			echo "#"
> -			echo "not ok $test_num $TEST_HDR_MSG # TIMEOUT $kselftest_timeout seconds"
> +			report_failure "$test_num $TEST_HDR_MSG # TIMEOUT $kselftest_timeout seconds"
>  		else
> -			echo "not ok $test_num $TEST_HDR_MSG # exit=$rc"
> +			report_failure "$test_num $TEST_HDR_MSG # exit=$rc"
>  		fi)
>  		cd - >/dev/null
>  	fi
> diff --git a/tools/testing/selftests/run_kselftest.sh b/tools/testing/selftests/run_kselftest.sh
> index 0443beacf3621ae36cb12ffd57f696ddef3526b5..c345f38ad424029bfe50d19b26bdd1d4bda29316 100755
> --- a/tools/testing/selftests/run_kselftest.sh
> +++ b/tools/testing/selftests/run_kselftest.sh
> @@ -36,6 +36,7 @@ Usage: $0 [OPTIONS]
>    -n | --netns			Run each test in namespace
>    -h | --help			Show this usage info
>    -o | --override-timeout	Number of seconds after which we timeout
> +  -e | --error-on-fail	After finishing all tests, exit with code 1 if any failed.

To me it looks like the new behavior could be the default,
removing the need for an additional option.

>  EOF
>  	exit $1
>  }
> @@ -44,6 +45,7 @@ COLLECTIONS=""
>  TESTS=""
>  dryrun=""
>  kselftest_override_timeout=""
> +ERROR_ON_FAIL=false
>  while true; do
>  	case "$1" in
>  		-s | --summary)
> @@ -71,6 +73,9 @@ while true; do
>  		-o | --override-timeout)
>  			kselftest_override_timeout="$2"
>  			shift 2 ;;
> +		-e | --error-on-fail)
> +			ERROR_ON_FAIL="true"
> +			shift ;;
>  		-h | --help)
>  			usage 0 ;;
>  		"")
> @@ -105,9 +110,18 @@ if [ -n "$TESTS" ]; then
>  	available="$(echo "$valid" | sed -e 's/ /\n/g')"
>  fi
>  
> +kselftest_failures_file=$(mktemp --tmpdir kselftest-failures-XXXXXX)

Quoting?

I'm not a fan of the implementation details, but also can't come up with
something better.

> +export kselftest_failures_file
> +
>  collections=$(echo "$available" | cut -d: -f1 | sort | uniq)
>  for collection in $collections ; do
>  	[ -w /dev/kmsg ] && echo "kselftest: Running tests in $collection" >> /dev/kmsg
>  	tests=$(echo "$available" | grep "^$collection:" | cut -d: -f2)
>  	($dryrun cd "$collection" && $dryrun run_many $tests)
>  done
> +
> +failures="$(cat "$kselftest_failures_file")"
> +rm "$kselftest_failures_file"
> +if "$ERROR_ON_FAIL" && [ "$failures" ]; then
> +	exit 1
> +fi
> 
> ---
> base-commit: 8f5ae30d69d7543eee0d70083daf4de8fe15d585
> change-id: 20251007-b4-ksft-error-on-fail-0c2cb3246041
> 
> Best regards,
> -- 
> Brendan Jackman <jackmanb@google.com>
>

Re: [PATCH] selftests/run_kselftest.sh: Add --error-on-fail flag

Posted by Brendan Jackman 4 months ago

On Fri Oct 10, 2025 at 6:34 AM UTC, Thomas Weißschuh wrote:
> On Tue, Oct 07, 2025 at 11:06:46AM +0000, Brendan Jackman wrote:
>> +report_failure()
>> +{
>> +	echo "not ok $*" >> "$kselftest_failures_file"
>> +	echo "$*" >> "$kselftest_failures_file"
>
> Both of these go into the failures file. The first should go to stdout, no?

Oops, thanks.

>> --- a/tools/testing/selftests/run_kselftest.sh
>> +++ b/tools/testing/selftests/run_kselftest.sh
>> @@ -36,6 +36,7 @@ Usage: $0 [OPTIONS]
>>    -n | --netns			Run each test in namespace
>>    -h | --help			Show this usage info
>>    -o | --override-timeout	Number of seconds after which we timeout
>> +  -e | --error-on-fail	After finishing all tests, exit with code 1 if any failed.
>
> To me it looks like the new behavior could be the default,
> removing the need for an additional option.

That sounds good to me, I will wait a day or two to see if anyone
objects to this before I send v2.

IMO it's important that your testing systems can tell you the difference
between "tests failed, your code is broken" and "I fell over, the
infrastructure is broken". My thinking was that if anyone's using this
in a proper CI loop, they won't actually want --error-on-fail because
then they sort of lose this distinction (nonzero exit code means "I fell
over" while "your tests failed" is communicated through KTAP). But, now
I think about it, this script is running on the kernel under test so
it's not capable of making this distinction anyway. So default SGTM.

>> +kselftest_failures_file=$(mktemp --tmpdir kselftest-failures-XXXXXX)
>
> Quoting?

Oh yep, thanks again.

> I'm not a fan of the implementation details, but also can't come up with
> something better.

Yeah, I feel the same. I think the next best thing is to rewrite this
whole thing in a proper programming language though. 

Thanks for taking a look :)