[PATCH v4 08/15] kunit: tool: Don't overwrite test status based on subtest counts

Thomas Weißschuh posted 15 patches 3 months, 2 weeks ago
There is a newer version of this series
[PATCH v4 08/15] kunit: tool: Don't overwrite test status based on subtest counts
Posted by Thomas Weißschuh 3 months, 2 weeks ago
If a subtest itself reports success, but the outer testcase fails,
the whole testcase should be reported as a failure.
However the status is recalculated based on the test counts,
overwriting the outer test result.
Synthesize a failed test in this case to make sure the failure is not
swallowed.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Reviewed-by: David Gow <davidgow@google.com>
---
 tools/testing/kunit/kunit_parser.py                                  | 5 +++++
 tools/testing/kunit/kunit_tool_test.py                               | 3 ++-
 tools/testing/kunit/test_data/test_is_test_passed-failure-nested.log | 3 +++
 3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/tools/testing/kunit/kunit_parser.py b/tools/testing/kunit/kunit_parser.py
index c176487356e6c94882046b19ea696d750905b8d5..2478beb28fc3db825855ad46200340e884da7df1 100644
--- a/tools/testing/kunit/kunit_parser.py
+++ b/tools/testing/kunit/kunit_parser.py
@@ -686,6 +686,11 @@ def bubble_up_test_results(test: Test) -> None:
 		counts.add_status(status)
 	elif test.counts.get_status() == TestStatus.TEST_CRASHED:
 		test.status = TestStatus.TEST_CRASHED
+	if not test.ok_status():
+		for t in subtests:
+			if not t.ok_status():
+				counts.add_status(t.status)
+				break
 
 def parse_test(lines: LineStream, expected_num: int, log: List[str], is_subtest: bool, printer: Printer) -> Test:
 	"""
diff --git a/tools/testing/kunit/kunit_tool_test.py b/tools/testing/kunit/kunit_tool_test.py
index b74dc05fc2fe5b3ff629172fc7aafeb5c3d29fb3..48a0dd0f9c87caf9f018aade161db90a613fc407 100755
--- a/tools/testing/kunit/kunit_tool_test.py
+++ b/tools/testing/kunit/kunit_tool_test.py
@@ -170,8 +170,9 @@ class KUnitParserTest(unittest.TestCase):
 		with open(nested_log) as file:
 			result = kunit_parser.parse_run_tests(file.readlines(), stdout)
 		self.assertEqual(kunit_parser.TestStatus.FAILURE, result.status)
-		self.assertEqual(result.counts.failed, 2)
+		self.assertEqual(result.counts.failed, 3)
 		self.assertEqual(kunit_parser.TestStatus.FAILURE, result.subtests[0].status)
+		self.assertEqual(kunit_parser.TestStatus.SUCCESS, result.subtests[0].subtests[0].status)
 		self.assertEqual(kunit_parser.TestStatus.FAILURE, result.subtests[1].status)
 		self.assertEqual(kunit_parser.TestStatus.FAILURE, result.subtests[1].subtests[0].status)
 
diff --git a/tools/testing/kunit/test_data/test_is_test_passed-failure-nested.log b/tools/testing/kunit/test_data/test_is_test_passed-failure-nested.log
index 2e528da39ab5b2be0fca6cf9160c10929fba3c9e..5498dfd0b0db24663e1a1e9bf78c587de6746522 100644
--- a/tools/testing/kunit/test_data/test_is_test_passed-failure-nested.log
+++ b/tools/testing/kunit/test_data/test_is_test_passed-failure-nested.log
@@ -1,5 +1,8 @@
 KTAP version 1
 1..2
+    KTAP version 1
+    1..1
+        ok 1 test 1
 not ok 1 subtest 1
     KTAP version 1
     1..1

-- 
2.50.0

Re: [PATCH v4 08/15] kunit: tool: Don't overwrite test status based on subtest counts
Posted by Rae Moar 3 months, 1 week ago
On Thu, Jun 26, 2025 at 2:10 AM Thomas Weißschuh
<thomas.weissschuh@linutronix.de> wrote:
>
> If a subtest itself reports success, but the outer testcase fails,
> the whole testcase should be reported as a failure.
> However the status is recalculated based on the test counts,
> overwriting the outer test result.
> Synthesize a failed test in this case to make sure the failure is not
> swallowed.

Hello!

This is a very exciting patch series! However, I have a few concerns
with this patch.

When I parse the following KTAP with this change:

KTAP version 1
1..2
    KTAP version 1
    1..2
        ok 1 test 1
        not ok 2 test 2
not ok 1 subtest 1
    KTAP version 1
    1..1
        not ok 1 subsubtest 1
not ok 2 subtest 2

The output is:

[20:54:12] ============================================================
[20:54:12] ======================= (2 subtests) =======================
[20:54:12] [PASSED] test 1
[20:54:12] [FAILED] test 2
[20:54:12] ==================== [FAILED] subtest 1 ====================
[20:54:12] ======================= (1 subtest) ========================
[20:54:12] [FAILED] subsubtest 1
[20:54:12] ==================== [FAILED] subtest 2 ====================
[20:54:12] ============================================================
[20:54:12] Testing complete. Ran 6 tests: passed: 1, failed: 5

This reports a total of 6 tests, which is not equivalent to the three
subtests plus the two suites. I believe this is because the change to
bubble_up_test_results below double counts the failed test case.

Historically, the KUnit parser only counts the results of test cases,
not the suites. I would like to stay as close to this as possible so
as to not inflate existing testing numbers. However, I believe the
main concern here is the case where if there is a suite reporting
failure but all subtests pass, it will not appear in the summary line.
For example,

KTAP version 1
1..1
    KTAP version 1
    1..1
        ok 1 test 1
not ok 1 subtest 1

Reporting: All passing: Tests run: 1, passed: 1

This is absolutely an important edge case to cover. Therefore, we
should add 1 failure count to the suite count if the bubbled up
results indicate it should instead pass.

Thanks!
-Rae

>
> Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
> Reviewed-by: David Gow <davidgow@google.com>
> ---
>  tools/testing/kunit/kunit_parser.py                                  | 5 +++++
>  tools/testing/kunit/kunit_tool_test.py                               | 3 ++-
>  tools/testing/kunit/test_data/test_is_test_passed-failure-nested.log | 3 +++
>  3 files changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/tools/testing/kunit/kunit_parser.py b/tools/testing/kunit/kunit_parser.py
> index c176487356e6c94882046b19ea696d750905b8d5..2478beb28fc3db825855ad46200340e884da7df1 100644
> --- a/tools/testing/kunit/kunit_parser.py
> +++ b/tools/testing/kunit/kunit_parser.py
> @@ -686,6 +686,11 @@ def bubble_up_test_results(test: Test) -> None:
>                 counts.add_status(status)
>         elif test.counts.get_status() == TestStatus.TEST_CRASHED:
>                 test.status = TestStatus.TEST_CRASHED
> +       if not test.ok_status():
> +               for t in subtests:
> +                       if not t.ok_status():
> +                               counts.add_status(t.status)
> +                               break

Here instead I recommend checking if not test.ok_status() and
test.counts.get_status() == TestStatus.SUCCESS and if so
counts.add_status(status)

>
>  def parse_test(lines: LineStream, expected_num: int, log: List[str], is_subtest: bool, printer: Printer) -> Test:
>         """
> diff --git a/tools/testing/kunit/kunit_tool_test.py b/tools/testing/kunit/kunit_tool_test.py
> index b74dc05fc2fe5b3ff629172fc7aafeb5c3d29fb3..48a0dd0f9c87caf9f018aade161db90a613fc407 100755
> --- a/tools/testing/kunit/kunit_tool_test.py
> +++ b/tools/testing/kunit/kunit_tool_test.py
> @@ -170,8 +170,9 @@ class KUnitParserTest(unittest.TestCase):
>                 with open(nested_log) as file:
>                         result = kunit_parser.parse_run_tests(file.readlines(), stdout)
>                 self.assertEqual(kunit_parser.TestStatus.FAILURE, result.status)
> -               self.assertEqual(result.counts.failed, 2)
> +               self.assertEqual(result.counts.failed, 3)
>                 self.assertEqual(kunit_parser.TestStatus.FAILURE, result.subtests[0].status)
> +               self.assertEqual(kunit_parser.TestStatus.SUCCESS, result.subtests[0].subtests[0].status)
>                 self.assertEqual(kunit_parser.TestStatus.FAILURE, result.subtests[1].status)
>                 self.assertEqual(kunit_parser.TestStatus.FAILURE, result.subtests[1].subtests[0].status)
>
> diff --git a/tools/testing/kunit/test_data/test_is_test_passed-failure-nested.log b/tools/testing/kunit/test_data/test_is_test_passed-failure-nested.log
> index 2e528da39ab5b2be0fca6cf9160c10929fba3c9e..5498dfd0b0db24663e1a1e9bf78c587de6746522 100644
> --- a/tools/testing/kunit/test_data/test_is_test_passed-failure-nested.log
> +++ b/tools/testing/kunit/test_data/test_is_test_passed-failure-nested.log
> @@ -1,5 +1,8 @@
>  KTAP version 1
>  1..2
> +    KTAP version 1
> +    1..1
> +        ok 1 test 1
>  not ok 1 subtest 1
>      KTAP version 1
>      1..1
>
> --
> 2.50.0
>
> --
> You received this message because you are subscribed to the Google Groups "KUnit Development" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to kunit-dev+unsubscribe@googlegroups.com.
> To view this discussion visit https://groups.google.com/d/msgid/kunit-dev/20250626-kunit-kselftests-v4-8-48760534fef5%40linutronix.de.
Re: [PATCH v4 08/15] kunit: tool: Don't overwrite test status based on subtest counts
Posted by Thomas Weißschuh 3 months, 1 week ago
On Tue, Jul 01, 2025 at 05:11:59PM -0400, Rae Moar wrote:
> On Thu, Jun 26, 2025 at 2:10 AM Thomas Weißschuh
> <thomas.weissschuh@linutronix.de> wrote:
> >
> > If a subtest itself reports success, but the outer testcase fails,
> > the whole testcase should be reported as a failure.
> > However the status is recalculated based on the test counts,
> > overwriting the outer test result.
> > Synthesize a failed test in this case to make sure the failure is not
> > swallowed.
> 
> This is a very exciting patch series! However, I have a few concerns
> with this patch.

Thanks for the review!

> When I parse the following KTAP with this change:
> 
> KTAP version 1
> 1..2
>     KTAP version 1
>     1..2
>         ok 1 test 1
>         not ok 2 test 2
> not ok 1 subtest 1
>     KTAP version 1
>     1..1
>         not ok 1 subsubtest 1
> not ok 2 subtest 2
> 
> The output is:
> 
> [20:54:12] ============================================================
> [20:54:12] ======================= (2 subtests) =======================
> [20:54:12] [PASSED] test 1
> [20:54:12] [FAILED] test 2
> [20:54:12] ==================== [FAILED] subtest 1 ====================
> [20:54:12] ======================= (1 subtest) ========================
> [20:54:12] [FAILED] subsubtest 1
> [20:54:12] ==================== [FAILED] subtest 2 ====================
> [20:54:12] ============================================================
> [20:54:12] Testing complete. Ran 6 tests: passed: 1, failed: 5
> 
> This reports a total of 6 tests, which is not equivalent to the three
> subtests plus the two suites. I believe this is because the change to
> bubble_up_test_results below double counts the failed test case.
> 
> Historically, the KUnit parser only counts the results of test cases,
> not the suites. I would like to stay as close to this as possible so
> as to not inflate existing testing numbers. However, I believe the
> main concern here is the case where if there is a suite reporting
> failure but all subtests pass, it will not appear in the summary line.
> For example,
> 
> KTAP version 1
> 1..1
>     KTAP version 1
>     1..1
>         ok 1 test 1
> not ok 1 subtest 1
> 
> Reporting: All passing: Tests run: 1, passed: 1
> 
> This is absolutely an important edge case to cover. Therefore, we
> should add 1 failure count to the suite count if the bubbled up
> results indicate it should instead pass.

Makes sense.

> >
> > Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
> > Reviewed-by: David Gow <davidgow@google.com>
> > ---
> >  tools/testing/kunit/kunit_parser.py                                  | 5 +++++
> >  tools/testing/kunit/kunit_tool_test.py                               | 3 ++-
> >  tools/testing/kunit/test_data/test_is_test_passed-failure-nested.log | 3 +++
> >  3 files changed, 10 insertions(+), 1 deletion(-)
> >
> > diff --git a/tools/testing/kunit/kunit_parser.py b/tools/testing/kunit/kunit_parser.py
> > index c176487356e6c94882046b19ea696d750905b8d5..2478beb28fc3db825855ad46200340e884da7df1 100644
> > --- a/tools/testing/kunit/kunit_parser.py
> > +++ b/tools/testing/kunit/kunit_parser.py
> > @@ -686,6 +686,11 @@ def bubble_up_test_results(test: Test) -> None:
> >                 counts.add_status(status)
> >         elif test.counts.get_status() == TestStatus.TEST_CRASHED:
> >                 test.status = TestStatus.TEST_CRASHED
> > +       if not test.ok_status():
> > +               for t in subtests:
> > +                       if not t.ok_status():
> > +                               counts.add_status(t.status)
> > +                               break
> 
> Here instead I recommend checking if not test.ok_status() and
> test.counts.get_status() == TestStatus.SUCCESS and if so
> counts.add_status(status)

Thanks for the recommendation. I tried this and it works well for this specific
testcase, but unfortunately all kinds of othes tests are now broken.
I'll look into it some more, but any hints are highly appreciated.
It has been a while since I looked at the code.

> >  def parse_test(lines: LineStream, expected_num: int, log: List[str], is_subtest: bool, printer: Printer) -> Test:
> >         """
> > diff --git a/tools/testing/kunit/kunit_tool_test.py b/tools/testing/kunit/kunit_tool_test.py
> > index b74dc05fc2fe5b3ff629172fc7aafeb5c3d29fb3..48a0dd0f9c87caf9f018aade161db90a613fc407 100755
> > --- a/tools/testing/kunit/kunit_tool_test.py
> > +++ b/tools/testing/kunit/kunit_tool_test.py
> > @@ -170,8 +170,9 @@ class KUnitParserTest(unittest.TestCase):
> >                 with open(nested_log) as file:
> >                         result = kunit_parser.parse_run_tests(file.readlines(), stdout)
> >                 self.assertEqual(kunit_parser.TestStatus.FAILURE, result.status)
> > -               self.assertEqual(result.counts.failed, 2)
> > +               self.assertEqual(result.counts.failed, 3)
> >                 self.assertEqual(kunit_parser.TestStatus.FAILURE, result.subtests[0].status)
> > +               self.assertEqual(kunit_parser.TestStatus.SUCCESS, result.subtests[0].subtests[0].status)
> >                 self.assertEqual(kunit_parser.TestStatus.FAILURE, result.subtests[1].status)
> >                 self.assertEqual(kunit_parser.TestStatus.FAILURE, result.subtests[1].subtests[0].status)
> >
> > diff --git a/tools/testing/kunit/test_data/test_is_test_passed-failure-nested.log b/tools/testing/kunit/test_data/test_is_test_passed-failure-nested.log
> > index 2e528da39ab5b2be0fca6cf9160c10929fba3c9e..5498dfd0b0db24663e1a1e9bf78c587de6746522 100644
> > --- a/tools/testing/kunit/test_data/test_is_test_passed-failure-nested.log
> > +++ b/tools/testing/kunit/test_data/test_is_test_passed-failure-nested.log
> > @@ -1,5 +1,8 @@
> >  KTAP version 1
> >  1..2
> > +    KTAP version 1
> > +    1..1
> > +        ok 1 test 1
> >  not ok 1 subtest 1
> >      KTAP version 1
> >      1..1
> >
> > --
> > 2.50.0
> >
> > --
> > You received this message because you are subscribed to the Google Groups "KUnit Development" group.
> > To unsubscribe from this group and stop receiving emails from it, send an email to kunit-dev+unsubscribe@googlegroups.com.
> > To view this discussion visit https://groups.google.com/d/msgid/kunit-dev/20250626-kunit-kselftests-v4-8-48760534fef5%40linutronix.de.
Re: [PATCH v4 08/15] kunit: tool: Don't overwrite test status based on subtest counts
Posted by Thomas Weißschuh 3 months ago
Hi Rae,

On Thu, Jul 03, 2025 at 05:30:02PM +0200, Thomas Weißschuh wrote:
> On Tue, Jul 01, 2025 at 05:11:59PM -0400, Rae Moar wrote:

<snip>

> > > Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
> > > Reviewed-by: David Gow <davidgow@google.com>
> > > ---
> > >  tools/testing/kunit/kunit_parser.py                                  | 5 +++++
> > >  tools/testing/kunit/kunit_tool_test.py                               | 3 ++-
> > >  tools/testing/kunit/test_data/test_is_test_passed-failure-nested.log | 3 +++
> > >  3 files changed, 10 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/tools/testing/kunit/kunit_parser.py b/tools/testing/kunit/kunit_parser.py
> > > index c176487356e6c94882046b19ea696d750905b8d5..2478beb28fc3db825855ad46200340e884da7df1 100644
> > > --- a/tools/testing/kunit/kunit_parser.py
> > > +++ b/tools/testing/kunit/kunit_parser.py
> > > @@ -686,6 +686,11 @@ def bubble_up_test_results(test: Test) -> None:
> > >                 counts.add_status(status)
> > >         elif test.counts.get_status() == TestStatus.TEST_CRASHED:
> > >                 test.status = TestStatus.TEST_CRASHED
> > > +       if not test.ok_status():
> > > +               for t in subtests:
> > > +                       if not t.ok_status():
> > > +                               counts.add_status(t.status)
> > > +                               break
> > 
> > Here instead I recommend checking if not test.ok_status() and
> > test.counts.get_status() == TestStatus.SUCCESS and if so
> > counts.add_status(status)
> 
> Thanks for the recommendation. I tried this and it works well for this specific
> testcase, but unfortunately all kinds of othes tests are now broken.
> I'll look into it some more, but any hints are highly appreciated.
> It has been a while since I looked at the code.

The following variant passes all tests. What do you think?

diff --git a/tools/testing/kunit/kunit_parser.py b/tools/testing/kunit/kunit_parser.py
index 333cd3a4a56b..5338489dcbe4 100644
--- a/tools/testing/kunit/kunit_parser.py
+++ b/tools/testing/kunit/kunit_parser.py
@@ -689,6 +689,9 @@ def bubble_up_test_results(test: Test) -> None:
        elif test.counts.get_status() == TestStatus.TEST_CRASHED:
                test.status = TestStatus.TEST_CRASHED
 
+       if status == TestStatus.FAILURE and test.counts.get_status() == TestStatus.SUCCESS:
+               counts.add_status(status)
+