Add a test of the NeXTcube framebuffer using the Tesseract OCR
engine on a screenshot of the framebuffer device.
The test is very quick:
$ avocado --show=app,ocr run tests/acceptance/machine_m68k_nextcube.py
JOB ID : f7d3c27976047036dc568183baf64c04863d9985
JOB LOG : ~/avocado/job-results/job-2019-06-29T16.18-f7d3c27/job.log
(1/1) tests/acceptance/machine_m68k_nextcube.py:NextCubeMachine.test_bootrom_framebuffer: |ocr:
ue r pun Honl'flx ; 5‘ 55‘
avg ncaaaaa 25 MHZ, memary jag m
Backplane slat «a
Ethernet address a a r a r3 2
Memgry sackets aea canflqured far 16MB Darlly page made stMs but have 16MB page made stMs )nstalled
Memgry sackets a and 1 canflqured far 16MB Darlly page made stMs but have 16MB page made stMs )nstalled
[...]
Yestlnq the rpu, 5::
system test raneg Errar egge 51
Egg: cammand
Default pggc devlce nut fauna
NEXY>I
PASS (3.59 s)
RESULTS : PASS 1 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0 | CANCEL 0
JOB TIME : 3.97 s
Documentation on how to install tesseract:
https://github.com/tesseract-ocr/tesseract/wiki#installation
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
---
v2:
- test fb sizes
- handle 2 versions of teseract
---
tests/acceptance/machine_m68k_nextcube.py | 102 ++++++++++++++++++++++
1 file changed, 102 insertions(+)
create mode 100644 tests/acceptance/machine_m68k_nextcube.py
diff --git a/tests/acceptance/machine_m68k_nextcube.py b/tests/acceptance/machine_m68k_nextcube.py
new file mode 100644
index 0000000000..f8e514a058
--- /dev/null
+++ b/tests/acceptance/machine_m68k_nextcube.py
@@ -0,0 +1,102 @@
+# Functional test that boots a VM and run OCR on the framebuffer
+#
+# Copyright (c) Philippe Mathieu-Daudé <f4bug@amsat.org>
+#
+# This work is licensed under the terms of the GNU GPL, version 2 or
+# later. See the COPYING file in the top-level directory.
+
+import logging
+import time
+import distutils.spawn
+
+from avocado import skipUnless
+from avocado_qemu import Test
+from avocado.utils import process
+
+try:
+ from PIL import Image
+ pil_available = True
+except ImportError:
+ pil_available = False
+
+
+def tesseract_available(expected_version):
+ if not distutils.spawn.find_executable('tesseract'):
+ return False
+ res = process.run('tesseract --version')
+ try:
+ version = res.stdout_text.split()[1]
+ except IndexError:
+ version = res.stderr_text.split()[1]
+ return int(version.split('.')[0]) == expected_version
+
+
+class NextCubeMachine(Test):
+
+ timeout = 15
+
+ def check_bootrom_framebuffer(self, screenshot_path):
+ rom_url = ('http://www.nextcomputers.org/NeXTfiles/Software/ROM_Files/'
+ '68040_Non-Turbo_Chipset/Rev_2.5_v66.BIN')
+ rom_hash = 'b3534796abae238a0111299fc406a9349f7fee24'
+ rom_path = self.fetch_asset(rom_url, asset_hash=rom_hash)
+
+ self.vm.set_machine('next-cube')
+ self.vm.add_args('-bios', rom_path)
+ self.vm.launch()
+
+ self.log.info('VM launched, waiting for display')
+ # FIXME how to catch the 'displaysurface_create 1120x832' trace-event?
+ time.sleep(2)
+
+ self.vm.command('human-monitor-command',
+ command_line='screendump %s' % screenshot_path)
+
+ @skipUnless(pil_available, 'Python PIL not installed')
+ def test_bootrom_framebuffer_size(self):
+ """
+ :avocado: tags=arch:m68k
+ :avocado: tags=machine:next-cube
+ :avocado: tags=device:framebuffer
+ """
+ screenshot_path = self.workdir + "dump"
+ self.check_bootrom_framebuffer(screenshot_path)
+
+ width, height = Image.open(screenshot_path).size
+ self.assertEqual(width, 1120)
+ self.assertEqual(height, 832)
+
+ @skipUnless(tesseract_available(3), 'tesseract v3 OCR tool not available')
+ def test_bootrom_framebuffer_ocr_with_tesseract_v3(self):
+ """
+ :avocado: tags=arch:m68k
+ :avocado: tags=machine:next-cube
+ :avocado: tags=device:framebuffer
+ """
+ screenshot_path = self.workdir + "dump"
+ self.check_bootrom_framebuffer(screenshot_path)
+
+ console_logger = logging.getLogger('ocr')
+ text = process.run("tesseract %s stdout" % screenshot_path).stdout_text
+ console_logger.debug(text)
+ self.assertIn('Backplane', text)
+ self.assertIn('Ethernet address', text)
+
+ @skipUnless(tesseract_available(4), 'tesseract v4 OCR tool not available')
+ def test_bootrom_framebuffer_ocr_with_tesseract_v4(self):
+ """
+ :avocado: tags=arch:m68k
+ :avocado: tags=machine:next-cube
+ :avocado: tags=device:framebuffer
+ """
+ screenshot_path = self.workdir + "dump"
+ self.check_bootrom_framebuffer(screenshot_path)
+
+ console_logger = logging.getLogger('ocr')
+ proc = process.run("tesseract --oem 1 %s stdout" % screenshot_path)
+ text = proc.stdout_text
+ console_logger.debug(text)
+ self.assertIn('Testing the FPU, SCC', text)
+ self.assertIn('System test failed. Error code 51', text)
+ self.assertIn('Boot command', text)
+ self.assertIn('Next>', text)
--
2.19.1
On Mon, Jul 01, 2019 at 05:34:35PM +0200, Philippe Mathieu-Daudé wrote:
> Add a test of the NeXTcube framebuffer using the Tesseract OCR
> engine on a screenshot of the framebuffer device.
>
> The test is very quick:
>
> $ avocado --show=app,ocr run tests/acceptance/machine_m68k_nextcube.py
Shouldn't we stick to "console" here? I understand we're resorting to ocr
but the content, for what it's worth, should be the same as in the console
for other tests. This allows a common expectation across tests too.
> JOB ID : f7d3c27976047036dc568183baf64c04863d9985
> JOB LOG : ~/avocado/job-results/job-2019-06-29T16.18-f7d3c27/job.log
> (1/1) tests/acceptance/machine_m68k_nextcube.py:NextCubeMachine.test_bootrom_framebuffer: |ocr:
> ue r pun Honl'flx ; 5‘ 55‘
> avg ncaaaaa 25 MHZ, memary jag m
> Backplane slat «a
> Ethernet address a a r a r3 2
> Memgry sackets aea canflqured far 16MB Darlly page made stMs but have 16MB page made stMs )nstalled
> Memgry sackets a and 1 canflqured far 16MB Darlly page made stMs but have 16MB page made stMs )nstalled
> [...]
> Yestlnq the rpu, 5::
> system test raneg Errar egge 51
> Egg: cammand
> Default pggc devlce nut fauna
> NEXY>I
> PASS (3.59 s)
> RESULTS : PASS 1 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0 | CANCEL 0
> JOB TIME : 3.97 s
>
> Documentation on how to install tesseract:
> https://github.com/tesseract-ocr/tesseract/wiki#installation
>
> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
> ---
> v2:
> - test fb sizes
> - handle 2 versions of teseract
> ---
> tests/acceptance/machine_m68k_nextcube.py | 102 ++++++++++++++++++++++
> 1 file changed, 102 insertions(+)
> create mode 100644 tests/acceptance/machine_m68k_nextcube.py
>
> diff --git a/tests/acceptance/machine_m68k_nextcube.py b/tests/acceptance/machine_m68k_nextcube.py
> new file mode 100644
> index 0000000000..f8e514a058
> --- /dev/null
> +++ b/tests/acceptance/machine_m68k_nextcube.py
> @@ -0,0 +1,102 @@
> +# Functional test that boots a VM and run OCR on the framebuffer
> +#
> +# Copyright (c) Philippe Mathieu-Daudé <f4bug@amsat.org>
> +#
> +# This work is licensed under the terms of the GNU GPL, version 2 or
> +# later. See the COPYING file in the top-level directory.
> +
> +import logging
> +import time
> +import distutils.spawn
> +
> +from avocado import skipUnless
> +from avocado_qemu import Test
> +from avocado.utils import process
Style nitpick:
from avocado_qemu import Test
from avocado import skipUnless
from avocado.utils import process
> +
> +try:
> + from PIL import Image
> + pil_available = True
Another style nitpick, but very minor issue is the use of ALL_CAPS
variables if they are at the module level. So that would become
PIL_AVAILABLE = True
> +except ImportError:
> + pil_available = False
> +
> +
> +def tesseract_available(expected_version):
> + if not distutils.spawn.find_executable('tesseract'):
Just though of pointing out that there's a similar function in
avocado.utils.path, called find_command:
https://avocado-framework.readthedocs.io/en/68.0/api/utils/avocado.utils.html#avocado.utils.path.find_command
Feel free to pick your poison! :)
> + return False
> + res = process.run('tesseract --version')
> + try:
> + version = res.stdout_text.split()[1]
> + except IndexError:
> + version = res.stderr_text.split()[1]
Do some versions write this to stdout and others to stderr?
> + return int(version.split('.')[0]) == expected_version
This can raise an exception if some other sort of output is
produced. How about:
import re
match = re.match(r'tesseract\s(\d)', res)
if match is not None:
# now this is guaranteed to be a digit
if int(match.groups()[0]) == expected_version:
return True
return False
> +
> +
> +class NextCubeMachine(Test):
> +
> + timeout = 15
> +
> + def check_bootrom_framebuffer(self, screenshot_path):
> + rom_url = ('http://www.nextcomputers.org/NeXTfiles/Software/ROM_Files/'
> + '68040_Non-Turbo_Chipset/Rev_2.5_v66.BIN')
> + rom_hash = 'b3534796abae238a0111299fc406a9349f7fee24'
> + rom_path = self.fetch_asset(rom_url, asset_hash=rom_hash)
> +
> + self.vm.set_machine('next-cube')
> + self.vm.add_args('-bios', rom_path)
> + self.vm.launch()
> +
> + self.log.info('VM launched, waiting for display')
> + # FIXME how to catch the 'displaysurface_create 1120x832' trace-event?
> + time.sleep(2)
There's avocado.utils.wait.wait_for() to *help* with waiting, but I'm
not sure about the trace-events.
> +
> + self.vm.command('human-monitor-command',
> + command_line='screendump %s' % screenshot_path)
> +
> + @skipUnless(pil_available, 'Python PIL not installed')
> + def test_bootrom_framebuffer_size(self):
> + """
> + :avocado: tags=arch:m68k
> + :avocado: tags=machine:next-cube
Here we hit the syntax limitation of the Avocado tags regex again:
https://avocado-framework.readthedocs.io/en/68.0/api/core/avocado.core.html#avocado.core.safeloader.DOCSTRING_DIRECTIVE_RE_RAW
I'll look into raising that limitation on the next release, but,
for the time being, this will need to be:
:avocado: tags=machine:next_cube
The same applies to the other tests, of course.
> + :avocado: tags=device:framebuffer
> + """
> + screenshot_path = self.workdir + "dump"
Best practice is to use os.path.join() instead.
> + self.check_bootrom_framebuffer(screenshot_path)
> +
> + width, height = Image.open(screenshot_path).size
> + self.assertEqual(width, 1120)
> + self.assertEqual(height, 832)
> +
> + @skipUnless(tesseract_available(3), 'tesseract v3 OCR tool not available')
> + def test_bootrom_framebuffer_ocr_with_tesseract_v3(self):
> + """
> + :avocado: tags=arch:m68k
> + :avocado: tags=machine:next-cube
> + :avocado: tags=device:framebuffer
> + """
> + screenshot_path = self.workdir + "dump"
> + self.check_bootrom_framebuffer(screenshot_path)
> +
> + console_logger = logging.getLogger('ocr')
> + text = process.run("tesseract %s stdout" % screenshot_path).stdout_text
> + console_logger.debug(text)
> + self.assertIn('Backplane', text)
> + self.assertIn('Ethernet address', text)
I haven't tried v3, but I'm curious... is this about the change in
command line syntax only? Or do v3 and v4 are able to recognize
different characters?
- Cleber.
> +
> + @skipUnless(tesseract_available(4), 'tesseract v4 OCR tool not available')
> + def test_bootrom_framebuffer_ocr_with_tesseract_v4(self):
> + """
> + :avocado: tags=arch:m68k
> + :avocado: tags=machine:next-cube
> + :avocado: tags=device:framebuffer
> + """
> + screenshot_path = self.workdir + "dump"
> + self.check_bootrom_framebuffer(screenshot_path)
> +
> + console_logger = logging.getLogger('ocr')
> + proc = process.run("tesseract --oem 1 %s stdout" % screenshot_path)
> + text = proc.stdout_text
> + console_logger.debug(text)
> + self.assertIn('Testing the FPU, SCC', text)
> + self.assertIn('System test failed. Error code 51', text)
> + self.assertIn('Boot command', text)
> + self.assertIn('Next>', text)
> --
> 2.19.1
>
On 7/1/19 10:09 PM, Cleber Rosa wrote:
> On Mon, Jul 01, 2019 at 05:34:35PM +0200, Philippe Mathieu-Daudé wrote:
>> Add a test of the NeXTcube framebuffer using the Tesseract OCR
>> engine on a screenshot of the framebuffer device.
>>
>> The test is very quick:
>>
>> $ avocado --show=app,ocr run tests/acceptance/machine_m68k_nextcube.py
>
> Shouldn't we stick to "console" here? I understand we're resorting to ocr
> but the content, for what it's worth, should be the same as in the console
> for other tests. This allows a common expectation across tests too.
>
>> JOB ID : f7d3c27976047036dc568183baf64c04863d9985
>> JOB LOG : ~/avocado/job-results/job-2019-06-29T16.18-f7d3c27/job.log
>> (1/1) tests/acceptance/machine_m68k_nextcube.py:NextCubeMachine.test_bootrom_framebuffer: |ocr:
>> ue r pun Honl'flx ; 5‘ 55‘
>> avg ncaaaaa 25 MHZ, memary jag m
>> Backplane slat «a
>> Ethernet address a a r a r3 2
>> Memgry sackets aea canflqured far 16MB Darlly page made stMs but have 16MB page made stMs )nstalled
>> Memgry sackets a and 1 canflqured far 16MB Darlly page made stMs but have 16MB page made stMs )nstalled
>> [...]
>> Yestlnq the rpu, 5::
>> system test raneg Errar egge 51
>> Egg: cammand
>> Default pggc devlce nut fauna
>> NEXY>I
>> PASS (3.59 s)
>> RESULTS : PASS 1 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0 | CANCEL 0
>> JOB TIME : 3.97 s
>>
>> Documentation on how to install tesseract:
>> https://github.com/tesseract-ocr/tesseract/wiki#installation
>>
>> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
>> ---
>> v2:
>> - test fb sizes
>> - handle 2 versions of teseract
>> ---
>> tests/acceptance/machine_m68k_nextcube.py | 102 ++++++++++++++++++++++
>> 1 file changed, 102 insertions(+)
>> create mode 100644 tests/acceptance/machine_m68k_nextcube.py
>>
>> diff --git a/tests/acceptance/machine_m68k_nextcube.py b/tests/acceptance/machine_m68k_nextcube.py
>> new file mode 100644
>> index 0000000000..f8e514a058
>> --- /dev/null
>> +++ b/tests/acceptance/machine_m68k_nextcube.py
>> @@ -0,0 +1,102 @@
>> +# Functional test that boots a VM and run OCR on the framebuffer
>> +#
>> +# Copyright (c) Philippe Mathieu-Daudé <f4bug@amsat.org>
>> +#
>> +# This work is licensed under the terms of the GNU GPL, version 2 or
>> +# later. See the COPYING file in the top-level directory.
>> +
>> +import logging
>> +import time
>> +import distutils.spawn
>> +
>> +from avocado import skipUnless
>> +from avocado_qemu import Test
>> +from avocado.utils import process
>
> Style nitpick:
>
> from avocado_qemu import Test
> from avocado import skipUnless
> from avocado.utils import process
What is the logic here?
>> +
>> +try:
>> + from PIL import Image
>> + pil_available = True
>
> Another style nitpick, but very minor issue is the use of ALL_CAPS
> variables if they are at the module level. So that would become
>
> PIL_AVAILABLE = True
>
>> +except ImportError:
>> + pil_available = False
OK.
>> +
>> +
>> +def tesseract_available(expected_version):
>> + if not distutils.spawn.find_executable('tesseract'):
>
> Just though of pointing out that there's a similar function in
> avocado.utils.path, called find_command:
>
> https://avocado-framework.readthedocs.io/en/68.0/api/utils/avocado.utils.html#avocado.utils.path.find_command
>
> Feel free to pick your poison! :)
OK.
>> + return False
>> + res = process.run('tesseract --version')
>> + try:
>> + version = res.stdout_text.split()[1]
>> + except IndexError:
>> + version = res.stderr_text.split()[1]
>
> Do some versions write this to stdout and others to stderr?
Yes...
v3: stderr
v4: stdout
>> + return int(version.split('.')[0]) == expected_version
>
> This can raise an exception if some other sort of output is
> produced. How about:
>
> import re
>
> match = re.match(r'tesseract\s(\d)', res)
> if match is not None:
> # now this is guaranteed to be a digit
> if int(match.groups()[0]) == expected_version:
> return True
> return False
I wanted to avoid regex, but OK.
>> +
>> +
>> +class NextCubeMachine(Test):
>> +
>> + timeout = 15
>> +
>> + def check_bootrom_framebuffer(self, screenshot_path):
>> + rom_url = ('http://www.nextcomputers.org/NeXTfiles/Software/ROM_Files/'
>> + '68040_Non-Turbo_Chipset/Rev_2.5_v66.BIN')
>> + rom_hash = 'b3534796abae238a0111299fc406a9349f7fee24'
>> + rom_path = self.fetch_asset(rom_url, asset_hash=rom_hash)
>> +
>> + self.vm.set_machine('next-cube')
>> + self.vm.add_args('-bios', rom_path)
>> + self.vm.launch()
>> +
>> + self.log.info('VM launched, waiting for display')
>> + # FIXME how to catch the 'displaysurface_create 1120x832' trace-event?
>> + time.sleep(2)
>
> There's avocado.utils.wait.wait_for() to *help* with waiting, but I'm
> not sure about the trace-events.
trace-events can be logged into a file, so as with chardev I'd like to
use a pipe and monitor it in parallel.
>> +
>> + self.vm.command('human-monitor-command',
>> + command_line='screendump %s' % screenshot_path)
>> +
>> + @skipUnless(pil_available, 'Python PIL not installed')
>> + def test_bootrom_framebuffer_size(self):
>> + """
>> + :avocado: tags=arch:m68k
>> + :avocado: tags=machine:next-cube
>
> Here we hit the syntax limitation of the Avocado tags regex again:
>
> https://avocado-framework.readthedocs.io/en/68.0/api/core/avocado.core.html#avocado.core.safeloader.DOCSTRING_DIRECTIVE_RE_RAW
>
> I'll look into raising that limitation on the next release, but,
> for the time being, this will need to be:
>
> :avocado: tags=machine:next_cube
>
> The same applies to the other tests, of course.
OK, since there are no warnings, I did not notice.
>> + :avocado: tags=device:framebuffer
>> + """
>> + screenshot_path = self.workdir + "dump"
>
> Best practice is to use os.path.join() instead.
OK.
>> + self.check_bootrom_framebuffer(screenshot_path)
>> +
>> + width, height = Image.open(screenshot_path).size
>> + self.assertEqual(width, 1120)
>> + self.assertEqual(height, 832)
>> +
>> + @skipUnless(tesseract_available(3), 'tesseract v3 OCR tool not available')
>> + def test_bootrom_framebuffer_ocr_with_tesseract_v3(self):
>> + """
>> + :avocado: tags=arch:m68k
>> + :avocado: tags=machine:next-cube
>> + :avocado: tags=device:framebuffer
>> + """
>> + screenshot_path = self.workdir + "dump"
>> + self.check_bootrom_framebuffer(screenshot_path)
>> +
>> + console_logger = logging.getLogger('ocr')
>> + text = process.run("tesseract %s stdout" % screenshot_path).stdout_text
>> + console_logger.debug(text)
>> + self.assertIn('Backplane', text)
>> + self.assertIn('Ethernet address', text)
>
> I haven't tried v3, but I'm curious... is this about the change in
> command line syntax only? Or do v3 and v4 are able to recognize
> different characters?
Yes, they use different engine.
In short:
"Tesseract 4 adds a new OCR engine based on LSTM neural networks. The
new version is faster and more accurate than version 3. The drawback is
that it is still alpha-level software."
[https://stackoverflow.com/questions/48498465/difference-between-tesseract-3-and-tesseract-4]
Thanks for your review!
> - Cleber.
>
>> +
>> + @skipUnless(tesseract_available(4), 'tesseract v4 OCR tool not available')
>> + def test_bootrom_framebuffer_ocr_with_tesseract_v4(self):
>> + """
>> + :avocado: tags=arch:m68k
>> + :avocado: tags=machine:next-cube
>> + :avocado: tags=device:framebuffer
>> + """
>> + screenshot_path = self.workdir + "dump"
>> + self.check_bootrom_framebuffer(screenshot_path)
>> +
>> + console_logger = logging.getLogger('ocr')
>> + proc = process.run("tesseract --oem 1 %s stdout" % screenshot_path)
>> + text = proc.stdout_text
>> + console_logger.debug(text)
>> + self.assertIn('Testing the FPU, SCC', text)
>> + self.assertIn('System test failed. Error code 51', text)
>> + self.assertIn('Boot command', text)
>> + self.assertIn('Next>', text)
>> --
>> 2.19.1
>>
© 2016 - 2026 Red Hat, Inc.