[PATCH v7 1/7] fuzz: accelerate non-crash detection

Qiuhao Li posted 7 patches 5 years, 1 month ago
Maintainers: Alexander Bulekov <alxndr@bu.edu>, Paolo Bonzini <pbonzini@redhat.com>, Stefan Hajnoczi <stefanha@redhat.com>, Thomas Huth <thuth@redhat.com>, Bandan Das <bsd@redhat.com>
There is a newer version of this series
Posted by Qiuhao Li 5 years, 1 month ago
During minimization we spend much of our time waiting for the timeout
program to reach its time limit. This patch watches QTest's output for the
CLOSED notification (which indicates that the redirection file was closed)
and uses it to detect early that a trace did not crash.
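
A minimal sketch of the early-exit idea, separate from the diff below: scan
the output line by line and stop as soon as either marker shows up. The
function name trace_crashes and the sample log lines are made up for
illustration; only the "CLOSED" marker and the token format come from this
patch.

```python
import io


def trace_crashes(stream, crash_token):
    """Return True if crash_token shows up before "CLOSED" or end of stream."""
    # For a text-mode stream, readline() returns "" at EOF, so "" is the
    # sentinel that ends the iteration.
    for line in iter(stream.readline, ""):
        if "CLOSED" in line:
            return False  # QTest finished reading the trace without crashing
        if crash_token in line:
            return True
    # EOF with neither marker: most likely a different type of crash
    return False


token = "SUMMARY: AddressSanitizer: stack-overflow"

# Hypothetical outputs, not captured from a real QEMU run
clean_run = io.StringIO("OPENED\nCLOSED\n")
crash_run = io.StringIO(
    "SUMMARY: AddressSanitizer: stack-overflow in some_function\n"
    "ABORTING\n"
)

print(trace_crashes(clean_run, token))
print(trace_crashes(crash_run, token))
```

In the script itself the stream would be the stdout pipe of the QEMU
subprocess, opened in text mode.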

Tested with the quadrupled trace input from:
  https://bugs.launchpad.net/qemu/+bug/1890333/comments/1

Original version:
  real	1m37.246s
  user	0m13.069s
  sys	0m8.399s

Refined version:
  real	0m45.904s
  user	0m16.874s
  sys	0m10.042s
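
To put the two measurements side by side, the arithmetic can be sketched as
follows. The helper to_seconds is a hypothetical name; the figures are the
ones quoted above.

```python
def to_seconds(stamp):
    """Convert a `time`-style stamp such as '1m37.246s' to seconds."""
    minutes, seconds = stamp.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


original = {"real": "1m37.246s", "user": "0m13.069s", "sys": "0m8.399s"}
refined = {"real": "0m45.904s", "user": "0m16.874s", "sys": "0m10.042s"}

# Wall-clock improvement
speedup = to_seconds(original["real"]) / to_seconds(refined["real"])

# CPU time actually consumed (user + sys)
cpu_before = to_seconds(original["user"]) + to_seconds(original["sys"])
cpu_after = to_seconds(refined["user"]) + to_seconds(refined["sys"])

print("wall-clock speedup: {:.2f}x".format(speedup))
print("CPU time: {:.3f}s -> {:.3f}s".format(cpu_before, cpu_after))
```

Wall-clock time drops by a factor of roughly 2.1 while CPU time rises
slightly, which is consistent with the saving coming from less idle waiting
rather than less computation.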

Note:

Sometimes the mutated trace, or even the same trace, may produce a different
crash summary (the second-to-last line of output) even though it indicates
the same bug. For example, Bug 1910826 [1], which triggers a stack overflow,
may output summaries like:

SUMMARY: AddressSanitizer: stack-overflow
/home/qiuhao/hack/qemu/build/../softmmu/physmem.c:488 in
flatview_do_translate

or

SUMMARY: AddressSanitizer: stack-overflow
(/home/qiuhao/hack/qemu/build/qemu-system-i386+0x27ca049) in __asan_memcpy

Etc.

If we used the whole summary line as the token, this variation could block
further minimization. So this patch uses only the first three words, which
indicate the type of crash:

SUMMARY: AddressSanitizer: stack-overflow
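
A small sketch of that token extraction, applied to the two summaries quoted
above. The function name crash_token_from_output, the trailing "ABORTING"
placeholder line, and the length guard are assumptions added for
illustration; the patch itself performs the extraction inline.

```python
def crash_token_from_output(output):
    """Return the first three words of the second-to-last output line."""
    lines = output.splitlines()
    if len(lines) < 2:
        return None  # too little output to contain a summary line
    return " ".join(lines[-2].split()[0:3])


# The two summaries from the commit message, each followed by a
# placeholder final line so the summary is second-to-last.
run_a = (
    "SUMMARY: AddressSanitizer: stack-overflow "
    "/home/qiuhao/hack/qemu/build/../softmmu/physmem.c:488 in "
    "flatview_do_translate\n"
    "ABORTING\n"
)
run_b = (
    "SUMMARY: AddressSanitizer: stack-overflow "
    "(/home/qiuhao/hack/qemu/build/qemu-system-i386+0x27ca049) in "
    "__asan_memcpy\n"
    "ABORTING\n"
)

token_a = crash_token_from_output(run_a)
token_b = crash_token_from_output(run_b)
print(token_a)
print(token_a == token_b)
```

Both summaries reduce to the same token, so either run is recognized as the
same crash type.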

[1] https://bugs.launchpad.net/qemu/+bug/1910826

Signed-off-by: Qiuhao Li <Qiuhao.Li@outlook.com>
---
 scripts/oss-fuzz/minimize_qtest_trace.py | 43 +++++++++++++++++-------
 1 file changed, 31 insertions(+), 12 deletions(-)

diff --git a/scripts/oss-fuzz/minimize_qtest_trace.py b/scripts/oss-fuzz/minimize_qtest_trace.py
index 5e405a0d5f..97f1201747 100755
--- a/scripts/oss-fuzz/minimize_qtest_trace.py
+++ b/scripts/oss-fuzz/minimize_qtest_trace.py
@@ -29,8 +29,14 @@ whether the crash occred. Optionally, manually set a string that idenitifes the
 crash by setting CRASH_TOKEN=
 """.format((sys.argv[0])))
 
+deduplication_note = """\n\
+Note: While trimming the input, sometimes the mutated trace triggers a different
+type of crash that still indicates the same bug. In this situation, our
+minimizer cannot recognize the match and is stopped from removing the input.
+In the future, we may use a more sophisticated crash deduplication method.
+\n"""
+
 def check_if_trace_crashes(trace, path):
-    global CRASH_TOKEN
     with open(path, "w") as tracefile:
         tracefile.write("".join(trace))
 
@@ -41,18 +47,32 @@ def check_if_trace_crashes(trace, path):
                            trace_path=path),
                           shell=True,
                           stdin=subprocess.PIPE,
-                          stdout=subprocess.PIPE)
-    stdo = rc.communicate()[0]
-    output = stdo.decode('unicode_escape')
-    if rc.returncode == 137:    # Timed Out
-        return False
-    if len(output.splitlines()) < 2:
-        return False
-
+                          stdout=subprocess.PIPE,
+                          encoding="utf-8")
+    global CRASH_TOKEN
     if CRASH_TOKEN is None:
-        CRASH_TOKEN = output.splitlines()[-2]
+        try:
+            outs, _ = rc.communicate(timeout=5)
+            CRASH_TOKEN = " ".join(outs.splitlines()[-2].split()[0:3])
+        except subprocess.TimeoutExpired:
+            print("subprocess.TimeoutExpired")
+            return False
+        print("Identifying Crashes by this string: {}".format(CRASH_TOKEN))
+        global deduplication_note
+        print(deduplication_note)
+        return True
 
-    return CRASH_TOKEN in output
+    for line in iter(rc.stdout.readline, ""):
+        if "CLOSED" in line:
+            return False
+        if CRASH_TOKEN in line:
+            return True
+        # We reach the end of stdout and there is no "CLOSED" or CRASH_TOKEN
+        # Usually this is caused by a different type of crash
+        if line == "":
+            return False
+
+    return False
 
 
 def minimize_trace(inpath, outpath):
@@ -66,7 +86,6 @@ def minimize_trace(inpath, outpath):
     print("Crashed in {} seconds".format(end-start))
     TIMEOUT = (end-start)*5
     print("Setting the timeout for {} seconds".format(TIMEOUT))
-    print("Identifying Crashes by this string: {}".format(CRASH_TOKEN))
 
     i = 0
     newtrace = trace[:]
-- 
2.25.1


Re: [PATCH v7 1/7] fuzz: accelerate non-crash detection
Posted by Alexander Bulekov 5 years ago
On 210110 2119, Qiuhao Li wrote:
> Signed-off-by: Qiuhao Li <Qiuhao.Li@outlook.com>

Reviewed-by: Alexander Bulekov <alxndr@bu.edu>

Thanks
