[Qemu-devel] [RFC PATCH v2] coccinelle: add a script to optimize tcg op using tcg_gen_extract()

Philippe Mathieu-Daudé posted 1 patch 6 years, 11 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20170510235204.9657-1-f4bug@amsat.org
Test checkpatch passed
Test docker passed
Test s390x passed
scripts/coccinelle/tcg_gen_extract.cocci | 61 ++++++++++++++++++++++++++++++++
1 file changed, 61 insertions(+)
create mode 100644 scripts/coccinelle/tcg_gen_extract.cocci
[Qemu-devel] [RFC PATCH v2] coccinelle: add a script to optimize tcg op using tcg_gen_extract()
Posted by Philippe Mathieu-Daudé 6 years, 11 months ago
Apply this script using:

$ docker run -v `pwd`:`pwd` -w `pwd` petersenna/coccinelle \
    --sp-file scripts/coccinelle/tcg_gen_extract.cocci \
    --macro-file scripts/cocci-macro-file.h \
    --dir target \
    --in-place

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
---

This is a new version of the coccinelle script addressing Richard comments and
trying to do it correctly. Also changed license to GPLv2+.

The first rule matches, it calls a python2 script that basically checks the
target_ulong is not overflowed: (msk << ofs) >> sizeof(target_ulong) == 0 

I think this is enough but if I'm wrong I can try to do as Richard suggested
(counting the low bits).

If this rule fails, there is no further processing, else the 3rd rule doing the
patch is called.
I couldn't figure still how to reuse args from rule2 in rule3 so it is mostly a
copy of rule1.

I also don't know yet how to process target-generic functions (ending _tl
instead of _32 or _64) eventually the correct --include-option may resolve this.

Applied it verified the ARM/M68K/PPC patches (2,3,5,7 from serie v1).
The other archs weren't patched since they use _tl generic functions.

I apologize for my python!

Regards,

Phil.

 scripts/coccinelle/tcg_gen_extract.cocci | 61 ++++++++++++++++++++++++++++++++
 1 file changed, 61 insertions(+)
 create mode 100644 scripts/coccinelle/tcg_gen_extract.cocci

diff --git a/scripts/coccinelle/tcg_gen_extract.cocci b/scripts/coccinelle/tcg_gen_extract.cocci
new file mode 100644
index 0000000000..d75ef7d98c
--- /dev/null
+++ b/scripts/coccinelle/tcg_gen_extract.cocci
@@ -0,0 +1,61 @@
+// optimize TCG using extract op
+//
+// Copyright: (C) 2017 Philippe Mathieu-Daudé. GPLv2+.
+// Confidence: Low
+// Options: --macro-file scripts/cocci-macro-file.h
+
+@match@ // match shri*+andi* pattern, calls script check_len_correct
+identifier ret, arg;
+constant ofs, len;
+identifier tcg_gen_shri =~ "^tcg_gen_shri_";
+identifier tcg_gen_andi =~ "^tcg_gen_andi_";
+position p;
+@@
+(
+tcg_gen_shri(ret, arg, ofs);
+tcg_gen_andi(ret, ret, len);@p
+)
+
+@script:python check_len_correct@
+ret_s << match.ret;
+arg_s << match.arg;
+ofs_s << match.ofs;
+msk_s << match.len;
+shr_s << match.tcg_gen_shri;
+and_s << match.tcg_gen_andi;
+@@
+try:
+    # only eval integer, no #define like 'SR_M'
+    ofs = long(ofs_s, 0)
+    msk = long(msk_s, 0)
+    # get op_size: 32/64
+    shr_sz = int(shr_s[-2:])
+    and_sz = int(and_s[-2:])
+    # op_size shr<and allowed
+    # check overflow
+    if shr_sz > and_sz or (msk << ofs) >> and_sz:
+        cocci.include_match(False) # no further process
+except ValueError: # op_size: "tl" not handled yet
+    cocci.include_match(False) # no further process
+
+@use_extract depends on check_len_correct@ // ofs/len are valid, we can replace
+identifier ret, arg;
+constant ofs, len;
+@@
+(
+// from Nikunj A Dadhania comment:
+// http://lists.nongnu.org/archive/html/qemu-devel/2017-02/msg05211.html
+-tcg_gen_shri_tl(ret, arg, ofs);
+-tcg_gen_andi_tl(ret, ret, len);
++tcg_gen_extract_tl(ret, arg, ofs, len);
+|
+// from Aurelien Jarno comment:
+// http://lists.nongnu.org/archive/html/qemu-devel/2017-05/msg01466.html
+-tcg_gen_shri_i32(ret, arg, ofs);
+-tcg_gen_andi_i32(ret, ret, len);
++tcg_gen_extract_i32(ret, arg, ofs, len);
+|
+-tcg_gen_shri_i64(ret, arg, ofs);
+-tcg_gen_andi_i64(ret, ret, len);
++tcg_gen_extract_i64(ret, arg, ofs, len);
+)
-- 
2.11.0


Re: [Qemu-devel] [RFC PATCH v2] coccinelle: add a script to optimize tcg op using tcg_gen_extract()
Posted by Philippe Mathieu-Daudé 6 years, 11 months ago
Ok I just understood Richard explanation, so this patch is WRONG and I 
need to get some real rest :(

On 05/10/2017 08:52 PM, Philippe Mathieu-Daudé wrote:
> Apply this script using:
>
> $ docker run -v `pwd`:`pwd` -w `pwd` petersenna/coccinelle \
>     --sp-file scripts/coccinelle/tcg_gen_extract.cocci \
>     --macro-file scripts/cocci-macro-file.h \
>     --dir target \
>     --in-place
>
> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
> ---
>
> This is a new version of the coccinelle script addressing Richard comments and
> trying to do it correctly. Also changed license to GPLv2+.
>
> The first rule matches, it calls a python2 script that basically checks the
> target_ulong is not overflowed: (msk << ofs) >> sizeof(target_ulong) == 0

WRONG

for ex tcg_gen_extract_i32():

     switch (len) {
     default:
         tcg_gen_shli_i32(ret, arg, 32 - len - ofs);
         tcg_gen_shri_i32(ret, ret, 32 - len);

'len' is not a mask as I used to think but is really a length.

> I think this is enough but if I'm wrong I can try to do as Richard suggested
> (counting the low bits).

Now I got it Richard!

> If this rule fails, there is no further processing, else the 3rd rule doing the
> patch is called.
> I couldn't figure still how to reuse args from rule2 in rule3 so it is mostly a
> copy of rule1.
>
> I also don't know yet how to process target-generic functions (ending _tl
> instead of _32 or _64) eventually the correct --include-option may resolve this.
>
> Applied it verified the ARM/M68K/PPC patches (2,3,5,7 from serie v1).
> The other archs weren't patched since they use _tl generic functions.
>
> I apologize for my python!
>
> Regards,
>
> Phil.
>
>  scripts/coccinelle/tcg_gen_extract.cocci | 61 ++++++++++++++++++++++++++++++++
>  1 file changed, 61 insertions(+)
>  create mode 100644 scripts/coccinelle/tcg_gen_extract.cocci
>
> diff --git a/scripts/coccinelle/tcg_gen_extract.cocci b/scripts/coccinelle/tcg_gen_extract.cocci
> new file mode 100644
> index 0000000000..d75ef7d98c
> --- /dev/null
> +++ b/scripts/coccinelle/tcg_gen_extract.cocci
> @@ -0,0 +1,61 @@
> +// optimize TCG using extract op
> +//
> +// Copyright: (C) 2017 Philippe Mathieu-Daudé. GPLv2+.
> +// Confidence: Low
> +// Options: --macro-file scripts/cocci-macro-file.h
> +
> +@match@ // match shri*+andi* pattern, calls script check_len_correct
> +identifier ret, arg;
> +constant ofs, len;
> +identifier tcg_gen_shri =~ "^tcg_gen_shri_";
> +identifier tcg_gen_andi =~ "^tcg_gen_andi_";
> +position p;
> +@@
> +(
> +tcg_gen_shri(ret, arg, ofs);
> +tcg_gen_andi(ret, ret, len);@p
> +)
> +
> +@script:python check_len_correct@
> +ret_s << match.ret;
> +arg_s << match.arg;
> +ofs_s << match.ofs;
> +msk_s << match.len;
> +shr_s << match.tcg_gen_shri;
> +and_s << match.tcg_gen_andi;
> +@@
> +try:
> +    # only eval integer, no #define like 'SR_M'
> +    ofs = long(ofs_s, 0)
> +    msk = long(msk_s, 0)
> +    # get op_size: 32/64
> +    shr_sz = int(shr_s[-2:])
> +    and_sz = int(and_s[-2:])
> +    # op_size shr<and allowed
> +    # check overflow
> +    if shr_sz > and_sz or (msk << ofs) >> and_sz:

WRONG

> +        cocci.include_match(False) # no further process
> +except ValueError: # op_size: "tl" not handled yet
> +    cocci.include_match(False) # no further process
> +
> +@use_extract depends on check_len_correct@ // ofs/len are valid, we can replace
> +identifier ret, arg;
> +constant ofs, len;
> +@@
> +(
> +// from Nikunj A Dadhania comment:
> +// http://lists.nongnu.org/archive/html/qemu-devel/2017-02/msg05211.html
> +-tcg_gen_shri_tl(ret, arg, ofs);
> +-tcg_gen_andi_tl(ret, ret, len);
> ++tcg_gen_extract_tl(ret, arg, ofs, len);
> +|
> +// from Aurelien Jarno comment:
> +// http://lists.nongnu.org/archive/html/qemu-devel/2017-05/msg01466.html
> +-tcg_gen_shri_i32(ret, arg, ofs);
> +-tcg_gen_andi_i32(ret, ret, len);
> ++tcg_gen_extract_i32(ret, arg, ofs, len);
> +|
> +-tcg_gen_shri_i64(ret, arg, ofs);
> +-tcg_gen_andi_i64(ret, ret, len);
> ++tcg_gen_extract_i64(ret, arg, ofs, len);
> +)
>