tests/qemu-iotests/check | 11 +++++++++-- tests/qemu-iotests/meson.build | 33 ++++++++++++++++++++++++++------ tests/qemu-iotests/testenv.py | 22 +++++++++++++++++---- tests/qemu-iotests/testrunner.py | 1 + 4 files changed, 55 insertions(+), 12 deletions(-)
To just repeat the patch 5 description... Currently meson registers a single test that invokes an entire group of I/O tests, hiding the test granularity from meson. There are various downsides of doing this * You cannot ask 'meson test' to invoke a single I/O test * The meson test timeout can't be applied to the individual tests * Meson only gets a pass/fail for the overall I/O test group not individual tests * If a CI job gets killed by the GitLab timeout, we don't get visibility into how far through the I/O tests execution got. This is not really specific to the I/O tests, the problem is common to any case of us running a test which is in fact another test harness which runs many tests. It would be nice to have meson have the full view of all tests run. Adapting the I/O tests is as easy win in this respect. This switches meson to perform test discovery by invoking 'check' in dry-run mode. It then registers one meson test case for each I/O test. Parallel execution remains disabled since the I/O tests do not use self contained execution environments and thus conflict with each other. Compare contrast output from a current job: https://gitlab.com/qemu-project/qemu/-/jobs/3863603546 [quote] 204/224 qemu:block / qemu-iotests qcow2 OK 329.94s 119 subtests passed [/quote] Vs what is seen with this series: https://gitlab.com/berrange/qemu/-/jobs/3865975463 [quote] 204/350 qemu:block / qemu-iotests-qcow2-001 OK 2.16s 1 subtests passed 205/350 qemu:block / qemu-iotests-qcow2-002 OK 2.77s 1 subtests passed ...snip... 329/350 qemu:block / qemu-iotests-qcow2-qemu-img-close-errors OK 6.19s 1 subtests passed 330/350 qemu:block / qemu-iotests-qcow2-qsd-jobs OK 0.55s 1 subtests passed [/quote] A few tweaks were needed to the iotests runner because it had a few assumptions about it always running in a tree that has already been built, which is obviously not the case at the time meson does test discovery. Daniel P. Berrangé (5): iotests: explicitly pass source/build dir to 'check' command iotests: allow test discovery before building iotests: strip subdir path when listing tests iotests: print TAP protocol version when reporting tests iotests: register each I/O test separately with meson tests/qemu-iotests/check | 11 +++++++++-- tests/qemu-iotests/meson.build | 33 ++++++++++++++++++++++++++------ tests/qemu-iotests/testenv.py | 22 +++++++++++++++++---- tests/qemu-iotests/testrunner.py | 1 + 4 files changed, 55 insertions(+), 12 deletions(-) -- 2.39.2
On 02/03/2023 19.46, Daniel P. Berrangé wrote: > To just repeat the patch 5 description... > > Currently meson registers a single test that invokes an entire group of > I/O tests, hiding the test granularity from meson. There are various > downsides of doing this > > * You cannot ask 'meson test' to invoke a single I/O test > * The meson test timeout can't be applied to the individual > tests > * Meson only gets a pass/fail for the overall I/O test group > not individual tests > * If a CI job gets killed by the GitLab timeout, we don't > get visibility into how far through the I/O tests > execution got. > > This is not really specific to the I/O tests, the problem is common > to any case of us running a test which is in fact another test > harness which runs many tests. It would be nice to have meson have > the full view of all tests run. Adapting the I/O tests is as easy > win in this respect. > > This switches meson to perform test discovery by invoking 'check' in > dry-run mode. It then registers one meson test case for each I/O > test. Parallel execution remains disabled since the I/O tests do not > use self contained execution environments and thus conflict with > each other. Great to see some movement in this area again! Some questions/remarks: 1) Could you remove tests/check-block.sh now? See also: https://lore.kernel.org/all/20220209101530.3442837-9-thuth@redhat.com/ 2) With regards to parallel execution ... I think it should be possible nowadays - the "check" script is normally also run with the "-j" switch by the tests/check-block.sh script, so if you remove the possibility to run in parallel, it's a regression from the previous behavior! 3) When I tried this last year, I had a weird problem that the terminal sometimes gets messed up ... I wasn't able to track it down back then - could you check by running "make check-block" many times (>10 times) to see whether it happens with your series or not? Thomas
On Fri, Mar 03, 2023 at 09:30:39AM +0100, Thomas Huth wrote: > On 02/03/2023 19.46, Daniel P. Berrangé wrote: > 3) When I tried this last year, I had a weird problem that > the terminal sometimes gets messed up ... I wasn't able > to track it down back then - could you check by running > "make check-block" many times (>10 times) to see whether > it happens with your series or not? I've just seen this - echo got disabled on my terminal. With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On Fri, Mar 03, 2023 at 10:45:40AM +0000, Daniel P. Berrangé wrote: > On Fri, Mar 03, 2023 at 09:30:39AM +0100, Thomas Huth wrote: > > On 02/03/2023 19.46, Daniel P. Berrangé wrote: > > 3) When I tried this last year, I had a weird problem that > > the terminal sometimes gets messed up ... I wasn't able > > to track it down back then - could you check by running > > "make check-block" many times (>10 times) to see whether > > it happens with your series or not? > > I've just seen this - echo got disabled on my terminal. The problem is that testrunner.py script doing # We want to save current tty settings during test run, # since an aborting qemu call may leave things screwed up. @contextmanager def savetty() -> Iterator[None]: isterm = sys.stdin.isatty() if isterm: fd = sys.stdin.fileno() attr = termios.tcgetattr(fd) try: yield finally: if isterm: termios.tcsetattr(fd, termios.TCSADRAIN, attr) When invoking 'check' this wraps around execution of the entire 'check' script. IOW it saves/restores the terminal once. When we invoke 'check' in parallel it will save/restore the same terminal for each invokation. If the 'termios.tcgetattr' call runs at the same time as there is a QEMU running which has put the terminal in raw mode, then when 'check' exits it'll "restore" the terminal to raw mode. IOW, this savetty() logic is fundamentally unsafe when invoking 'check' in parallel. AFAICT, the more reliable thing todo would be to spawn a new PTY when invoking each test, so there's no cleanup required. If QEMU aborts leaving the TTY messed up it dosn't matter was it was a temporary throwaway PTY. With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On 03/03/2023 14.06, Daniel P. Berrangé wrote: > On Fri, Mar 03, 2023 at 10:45:40AM +0000, Daniel P. Berrangé wrote: >> On Fri, Mar 03, 2023 at 09:30:39AM +0100, Thomas Huth wrote: >>> On 02/03/2023 19.46, Daniel P. Berrangé wrote: >>> 3) When I tried this last year, I had a weird problem that >>> the terminal sometimes gets messed up ... I wasn't able >>> to track it down back then - could you check by running >>> "make check-block" many times (>10 times) to see whether >>> it happens with your series or not? >> >> I've just seen this - echo got disabled on my terminal. > > The problem is that testrunner.py script doing > > # We want to save current tty settings during test run, > # since an aborting qemu call may leave things screwed up. > @contextmanager > def savetty() -> Iterator[None]: > isterm = sys.stdin.isatty() > if isterm: > fd = sys.stdin.fileno() > attr = termios.tcgetattr(fd) > > try: > yield > finally: > if isterm: > termios.tcsetattr(fd, termios.TCSADRAIN, attr) > > > When invoking 'check' this wraps around execution of the entire > 'check' script. IOW it saves/restores the terminal once. > > When we invoke 'check' in parallel it will save/restore the same > terminal for each invokation. > > If the 'termios.tcgetattr' call runs at the same time as there is > a QEMU running which has put the terminal in raw mode, then when > 'check' exits it'll "restore" the terminal to raw mode. > > IOW, this savetty() logic is fundamentally unsafe when invoking > 'check' in parallel. Hmm, couldn't we e.g. also simply skip this termios stuff when running with "--tap" ? Thomas
On Fri, Mar 03, 2023 at 04:49:36PM +0100, Thomas Huth wrote: > On 03/03/2023 14.06, Daniel P. Berrangé wrote: > > On Fri, Mar 03, 2023 at 10:45:40AM +0000, Daniel P. Berrangé wrote: > > > On Fri, Mar 03, 2023 at 09:30:39AM +0100, Thomas Huth wrote: > > > > On 02/03/2023 19.46, Daniel P. Berrangé wrote: > > > > 3) When I tried this last year, I had a weird problem that > > > > the terminal sometimes gets messed up ... I wasn't able > > > > to track it down back then - could you check by running > > > > "make check-block" many times (>10 times) to see whether > > > > it happens with your series or not? > > > > > > I've just seen this - echo got disabled on my terminal. > > > > The problem is that testrunner.py script doing > > > > # We want to save current tty settings during test run, > > # since an aborting qemu call may leave things screwed up. > > @contextmanager > > def savetty() -> Iterator[None]: > > isterm = sys.stdin.isatty() > > if isterm: > > fd = sys.stdin.fileno() > > attr = termios.tcgetattr(fd) > > > > try: > > yield > > finally: > > if isterm: > > termios.tcsetattr(fd, termios.TCSADRAIN, attr) > > > > > > When invoking 'check' this wraps around execution of the entire > > 'check' script. IOW it saves/restores the terminal once. > > > > When we invoke 'check' in parallel it will save/restore the same > > terminal for each invokation. > > > > If the 'termios.tcgetattr' call runs at the same time as there is > > a QEMU running which has put the terminal in raw mode, then when > > 'check' exits it'll "restore" the terminal to raw mode. > > > > IOW, this savetty() logic is fundamentally unsafe when invoking > > 'check' in parallel. > > Hmm, couldn't we e.g. also simply skip this termios stuff when running with > "--tap" ? It actually turns out to be way simpler. We merely need to set stdin=subprocess.DEVNULL. There is no valid reason for the test cases to be reading for stdin, as that would block execution. By setting stdin==/dev/null, the isatty(0) checks will fail and thus QEMU won't mess with termios, and thus we don't need any save/restore dance either. With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On Fri, Mar 03, 2023 at 09:30:39AM +0100, Thomas Huth wrote: > On 02/03/2023 19.46, Daniel P. Berrangé wrote: > > To just repeat the patch 5 description... > > > > Currently meson registers a single test that invokes an entire group of > > I/O tests, hiding the test granularity from meson. There are various > > downsides of doing this > > > > * You cannot ask 'meson test' to invoke a single I/O test > > * The meson test timeout can't be applied to the individual > > tests > > * Meson only gets a pass/fail for the overall I/O test group > > not individual tests > > * If a CI job gets killed by the GitLab timeout, we don't > > get visibility into how far through the I/O tests > > execution got. > > > > This is not really specific to the I/O tests, the problem is common > > to any case of us running a test which is in fact another test > > harness which runs many tests. It would be nice to have meson have > > the full view of all tests run. Adapting the I/O tests is as easy > > win in this respect. > > > > This switches meson to perform test discovery by invoking 'check' in > > dry-run mode. It then registers one meson test case for each I/O > > test. Parallel execution remains disabled since the I/O tests do not > > use self contained execution environments and thus conflict with > > each other. > > Great to see some movement in this area again! > > Some questions/remarks: > > 1) Could you remove tests/check-block.sh now? See also: > https://lore.kernel.org/all/20220209101530.3442837-9-thuth@redhat.com/ Possibly, I wasn't sure if that was wanted as a general entry point for humans, or was solely for meson ? > 2) With regards to parallel execution ... I think it should be > possible nowadays - the "check" script is normally also run > with the "-j" switch by the tests/check-block.sh script, so > if you remove the possibility to run in parallel, it's a > regression from the previous behavior! Hmmm, I got *masses* of test failures when running in parallel but I'll check again to be sure. With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On 03/03/2023 09.53, Daniel P. Berrangé wrote: > On Fri, Mar 03, 2023 at 09:30:39AM +0100, Thomas Huth wrote: >> On 02/03/2023 19.46, Daniel P. Berrangé wrote: >>> To just repeat the patch 5 description... >>> >>> Currently meson registers a single test that invokes an entire group of >>> I/O tests, hiding the test granularity from meson. There are various >>> downsides of doing this >>> >>> * You cannot ask 'meson test' to invoke a single I/O test >>> * The meson test timeout can't be applied to the individual >>> tests >>> * Meson only gets a pass/fail for the overall I/O test group >>> not individual tests >>> * If a CI job gets killed by the GitLab timeout, we don't >>> get visibility into how far through the I/O tests >>> execution got. >>> >>> This is not really specific to the I/O tests, the problem is common >>> to any case of us running a test which is in fact another test >>> harness which runs many tests. It would be nice to have meson have >>> the full view of all tests run. Adapting the I/O tests is as easy >>> win in this respect. >>> >>> This switches meson to perform test discovery by invoking 'check' in >>> dry-run mode. It then registers one meson test case for each I/O >>> test. Parallel execution remains disabled since the I/O tests do not >>> use self contained execution environments and thus conflict with >>> each other. >> >> Great to see some movement in this area again! >> >> Some questions/remarks: >> >> 1) Could you remove tests/check-block.sh now? See also: >> https://lore.kernel.org/all/20220209101530.3442837-9-thuth@redhat.com/ > > Possibly, I wasn't sure if that was wanted as a general entry > point for humans, or was solely for meson ? I think this script was only ever used for "make check-block", I never heard of anybody really using this script directly in a regular fashion. Humans rather run the tests/qemu-iotests/check script directly. Also see its origins: https://gitlab.com/qemu-project/qemu/-/commit/b8c6f29eb84cd3ccbbf HTH, Thomas
On Fri, Mar 03, 2023 at 08:53:32AM +0000, Daniel P. Berrangé wrote: > On Fri, Mar 03, 2023 at 09:30:39AM +0100, Thomas Huth wrote: > > On 02/03/2023 19.46, Daniel P. Berrangé wrote: > > > To just repeat the patch 5 description... > > > > > > Currently meson registers a single test that invokes an entire group of > > > I/O tests, hiding the test granularity from meson. There are various > > > downsides of doing this > > > > > > * You cannot ask 'meson test' to invoke a single I/O test > > > * The meson test timeout can't be applied to the individual > > > tests > > > * Meson only gets a pass/fail for the overall I/O test group > > > not individual tests > > > * If a CI job gets killed by the GitLab timeout, we don't > > > get visibility into how far through the I/O tests > > > execution got. > > > > > > This is not really specific to the I/O tests, the problem is common > > > to any case of us running a test which is in fact another test > > > harness which runs many tests. It would be nice to have meson have > > > the full view of all tests run. Adapting the I/O tests is as easy > > > win in this respect. > > > > > > This switches meson to perform test discovery by invoking 'check' in > > > dry-run mode. It then registers one meson test case for each I/O > > > test. Parallel execution remains disabled since the I/O tests do not > > > use self contained execution environments and thus conflict with > > > each other. > > > > Great to see some movement in this area again! > > > > Some questions/remarks: > > > > 1) Could you remove tests/check-block.sh now? See also: > > https://lore.kernel.org/all/20220209101530.3442837-9-thuth@redhat.com/ > > Possibly, I wasn't sure if that was wanted as a general entry > point for humans, or was solely for meson ? > > > 2) With regards to parallel execution ... I think it should be > > possible nowadays - the "check" script is normally also run > > with the "-j" switch by the tests/check-block.sh script, so > > if you remove the possibility to run in parallel, it's a > > regression from the previous behavior! > > Hmmm, I got *masses* of test failures when running in parallel > but I'll check again to be sure. Ooooh, the 'check' script only uses a unique sub-dir for tests when passing -j with a value >= 2. It also merely appends the test name, so it is still broken for anyone who runs multiple copies of 'check' in parallel for different configurations, even if they passed -j. It needs to just always use an isolated subdir no matter what, with process level uniqueness, not merely test. With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On Thu, Mar 02, 2023 at 06:46:01PM +0000, Daniel P. Berrangé wrote: > To just repeat the patch 5 description... > > Currently meson registers a single test that invokes an entire group of > I/O tests, hiding the test granularity from meson. There are various > downsides of doing this > > * You cannot ask 'meson test' to invoke a single I/O test > * The meson test timeout can't be applied to the individual > tests > * Meson only gets a pass/fail for the overall I/O test group > not individual tests Another big benefit is that we get to see the execution time of every I/O test, so can identify which handful of tests are making it take almost 5 minutes to run in gitlab... The really big 4 are: 199/314 qemu:block / qemu-iotests-qcow2-040 OK 16.62s 1 subtests passed 200/314 qemu:block / qemu-iotests-qcow2-041 OK 44.51s 1 subtests passed 265/314 qemu:block / qemu-iotests-qcow2-191 OK 17.93s 1 subtests passed 282/314 qemu:block / qemu-iotests-qcow2-271 OK 25.03s 1 subtests passed there are a bunch in the ~5 second range, and then most are 1 second or less. > * If a CI job gets killed by the GitLab timeout, we don't > get visibility into how far through the I/O tests > execution got. > > This is not really specific to the I/O tests, the problem is common > to any case of us running a test which is in fact another test > harness which runs many tests. It would be nice to have meson have > the full view of all tests run. Adapting the I/O tests is as easy > win in this respect. > > This switches meson to perform test discovery by invoking 'check' in > dry-run mode. It then registers one meson test case for each I/O > test. Parallel execution remains disabled since the I/O tests do not > use self contained execution environments and thus conflict with > each other. > > Compare contrast output from a current job: > > https://gitlab.com/qemu-project/qemu/-/jobs/3863603546 > > [quote] > 204/224 qemu:block / qemu-iotests qcow2 OK 329.94s 119 subtests passed > [/quote] > > Vs what is seen with this series: > > https://gitlab.com/berrange/qemu/-/jobs/3865975463 > > [quote] > 204/350 qemu:block / qemu-iotests-qcow2-001 OK 2.16s 1 subtests passed > 205/350 qemu:block / qemu-iotests-qcow2-002 OK 2.77s 1 subtests passed > > ...snip... > > 329/350 qemu:block / qemu-iotests-qcow2-qemu-img-close-errors OK 6.19s 1 subtests passed > 330/350 qemu:block / qemu-iotests-qcow2-qsd-jobs OK 0.55s 1 subtests passed > [/quote] > > A few tweaks were needed to the iotests runner because it had a few > assumptions about it always running in a tree that has already been > built, which is obviously not the case at the time meson does test > discovery. > > Daniel P. Berrangé (5): > iotests: explicitly pass source/build dir to 'check' command > iotests: allow test discovery before building > iotests: strip subdir path when listing tests > iotests: print TAP protocol version when reporting tests > iotests: register each I/O test separately with meson > > tests/qemu-iotests/check | 11 +++++++++-- > tests/qemu-iotests/meson.build | 33 ++++++++++++++++++++++++++------ > tests/qemu-iotests/testenv.py | 22 +++++++++++++++++---- > tests/qemu-iotests/testrunner.py | 1 + > 4 files changed, 55 insertions(+), 12 deletions(-) > > -- > 2.39.2 > With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
© 2016 - 2024 Red Hat, Inc.