[RFC v2 00/10] Introduce an extensible static analyzer

Alberto Faria posted 10 patches 1 year, 9 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20220729130040.1428779-1-afaria@redhat.com
Maintainers: Paolo Bonzini <pbonzini@redhat.com>, Richard Henderson <richard.henderson@linaro.org>, Gerd Hoffmann <kraxel@redhat.com>, Kevin Wolf <kwolf@redhat.com>, Hanna Reitz <hreitz@redhat.com>, John Snow <jsnow@redhat.com>, Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>, Eric Blake <eblake@redhat.com>, Stefan Hajnoczi <stefanha@redhat.com>, Fam Zheng <fam@euphon.net>, "Philippe Mathieu-Daudé" <f4bug@amsat.org>, "Denis V. Lunev" <den@openvz.org>, Alberto Garcia <berto@igalia.com>, "Richard W.M. Jones" <rjones@redhat.com>, Stefan Weil <sw@weilnetz.de>, Jeff Cody <codyprime@gmail.com>, Markus Armbruster <armbru@redhat.com>, "Marc-André Lureau" <marcandre.lureau@redhat.com>, Raphael Norwitz <raphael.norwitz@nutanix.com>, "Michael S. Tsirkin" <mst@redhat.com>, Greg Kurz <groug@kaod.org>, Christian Schoenebeck <qemu_oss@crudebyte.com>, "Alex Bennée" <alex.bennee@linaro.org>, Laurent Vivier <lvivier@redhat.com>, Amit Shah <amit@kernel.org>, "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>, Peter Xu <peterx@redhat.com>, Jason Wang <jasowang@redhat.com>, Eduardo Habkost <eduardo@habkost.net>, Marcel Apfelbaum <marcel.apfelbaum@gmail.com>, Dmitry Fleytman <dmitry.fleytman@gmail.com>, Keith Busch <kbusch@kernel.org>, Klaus Jensen <its@irrelevant.dk>, Hannes Reinecke <hare@suse.com>, Alex Williamson <alex.williamson@redhat.com>, Eric Auger <eric.auger@redhat.com>, David Hildenbrand <david@redhat.com>, "Daniel P. Berrangé" <berrange@redhat.com>, Juan Quintela <quintela@redhat.com>, "Dr. David Alan Gilbert" <dgilbert@redhat.com>, Michael Roth <michael.roth@amd.com>, Konstantin Kostiuk <kkostiuk@redhat.com>, Thomas Huth <thuth@redhat.com>, Xie Yongji <xieyongji@bytedance.com>, Marcelo Tosatti <mtosatti@redhat.com>
accel/kvm/kvm-all.c                        |  12 +-
accel/tcg/plugin-gen.c                     |   9 +-
accel/tcg/translate-all.c                  |   9 +-
audio/audio.c                              |   5 +-
block.c                                    |   2 +-
block/backup.c                             |   2 +-
block/block-copy.c                         |   4 +-
block/commit.c                             |   2 +-
block/dirty-bitmap.c                       |   6 +-
block/file-posix.c                         |   6 +-
block/io.c                                 |  52 +-
block/mirror.c                             |   4 +-
block/monitor/block-hmp-cmds.c             |   2 +-
block/nvme.c                               |   3 +-
block/parallels.c                          |  28 +-
block/qcow.c                               |  10 +-
block/qcow2-bitmap.c                       |   6 +-
block/qcow2-snapshot.c                     |   6 +-
block/qcow2.c                              |  38 +-
block/qcow2.h                              |  14 +-
block/qed-table.c                          |   2 +-
block/qed.c                                |  14 +-
block/quorum.c                             |   7 +-
block/ssh.c                                |   6 +-
block/throttle-groups.c                    |   3 +-
block/vdi.c                                |  17 +-
block/vhdx.c                               |   8 +-
block/vmdk.c                               |  11 +-
block/vpc.c                                |   4 +-
block/vvfat.c                              |  11 +-
blockdev.c                                 |   2 +-
chardev/char-ringbuf.c                     |   4 +-
contrib/ivshmem-server/main.c              |   4 +-
contrib/vhost-user-blk/vhost-user-blk.c    |   5 +-
dump/dump.c                                |   4 +-
fsdev/virtfs-proxy-helper.c                |   3 +-
gdbstub.c                                  |  18 +-
hw/audio/intel-hda.c                       |   7 +-
hw/audio/pcspk.c                           |   7 +-
hw/char/virtio-serial-bus.c                |  14 +-
hw/display/cirrus_vga.c                    |   5 +-
hw/hyperv/vmbus.c                          |  10 +-
hw/i386/intel_iommu.c                      |  28 +-
hw/i386/pc_q35.c                           |   5 +-
hw/ide/pci.c                               |   4 +-
hw/net/rtl8139.c                           |   3 +-
hw/net/virtio-net.c                        |   6 +-
hw/net/vmxnet3.c                           |   3 +-
hw/nvme/ctrl.c                             |  17 +-
hw/nvram/fw_cfg.c                          |   3 +-
hw/scsi/megasas.c                          |   6 +-
hw/scsi/mptconfig.c                        |   7 +-
hw/scsi/mptsas.c                           |  14 +-
hw/scsi/scsi-bus.c                         |   6 +-
hw/usb/dev-audio.c                         |  13 +-
hw/usb/hcd-ehci.c                          |   6 +-
hw/usb/hcd-ohci.c                          |   4 +-
hw/usb/hcd-xhci.c                          |  56 +-
hw/vfio/common.c                           |  21 +-
hw/virtio/vhost-vdpa.c                     |   3 +-
hw/virtio/vhost.c                          |  11 +-
hw/virtio/virtio-iommu.c                   |   4 +-
hw/virtio/virtio-mem.c                     |   9 +-
include/block/block-common.h               |   2 +-
include/block/block-hmp-cmds.h             |   2 +-
include/block/block-io.h                   |   5 +-
include/block/block_int-common.h           |  12 +-
include/qemu/coroutine.h                   |  43 +-
io/channel-command.c                       |  10 +-
migration/migration.c                      |  12 +-
net/dump.c                                 |  16 +-
net/vhost-vdpa.c                           |   8 +-
qemu-img.c                                 |   6 +-
qga/commands-posix-ssh.c                   |  10 +-
softmmu/physmem.c                          |  18 +-
softmmu/qtest.c                            |   5 +-
static-analyzer.py                         | 801 +++++++++++++++++++++
static_analyzer/__init__.py                | 348 +++++++++
static_analyzer/coroutine_fn.py            | 280 +++++++
static_analyzer/no_coroutine_fn.py         | 111 +++
static_analyzer/return_value_never_used.py | 220 ++++++
subprojects/libvduse/libvduse.c            |  12 +-
subprojects/libvhost-user/libvhost-user.c  |  24 +-
target/i386/host-cpu.c                     |   3 +-
target/i386/kvm/kvm.c                      |  19 +-
tcg/optimize.c                             |   3 +-
tests/qtest/libqos/malloc.c                |   5 +-
tests/qtest/libqos/qgraph.c                |   3 +-
tests/qtest/test-x86-cpuid-compat.c        |   8 +-
tests/qtest/virtio-9p-test.c               |   6 +-
tests/unit/test-aio-multithread.c          |   5 +-
tests/vhost-user-bridge.c                  |  19 +-
ui/vnc.c                                   |  23 +-
util/aio-posix.c                           |   7 +-
util/uri.c                                 |  18 +-
95 files changed, 2160 insertions(+), 519 deletions(-)
create mode 100755 static-analyzer.py
create mode 100644 static_analyzer/__init__.py
create mode 100644 static_analyzer/coroutine_fn.py
create mode 100644 static_analyzer/no_coroutine_fn.py
create mode 100644 static_analyzer/return_value_never_used.py
[RFC v2 00/10] Introduce an extensible static analyzer
Posted by Alberto Faria 1 year, 9 months ago
This series introduces a static analyzer for QEMU. It consists of a
single static-analyzer.py script that relies on libclang's Python
bindings, and provides a common framework on which arbitrary static
analysis checks can be developed and run against QEMU's code base.

Summary of the series:

  - Patch 1 adds the base static analyzer, along with a simple check
    that finds static functions whose return value is never used, and
    patch 2 fixes many occurrences of this.

  - Patch 3 introduces support for output-comparison check tests, and
    adds some tests to the abovementioned check.

  - Patch 4 makes the analyzer skip checks on a translation unit when it
    hasn't been modified since the last time those checks passed.

  - Patch 5 adds a check to ensure that non-coroutine_fn functions don't
    perform direct calls to coroutine_fn functions, and patch 6 fixes
    some violations of this rule.

  - Patch 7 adds a check to ensure that operations on coroutine_fn
    pointers make sense, like assignment and indirect calls, and patch 8
    fixes some problems detected by the check. (Implementing this check
    properly is complicated, since AFAICT annotation attributes cannot
    be applied directly to types. This part still needs a lot of work.)

  - Patch 9 introduces a no_coroutine_fn marker for functions that
    should not be called from coroutines, makes generated_co_wrapper
    evaluate to no_coroutine_fn, and adds a check enforcing this rule.
    Patch 10 fixes some violations that it finds.

The current primary motivation for this work is enforcing rules around
block layer coroutines, which is why most of the series focuses on that.
However, the static analyzer is intended to be sufficiently generic to
satisfy other present and future QEMU static analysis needs.

Performance isn't great, but with some more optimization, the analyzer
should be fast enough to be used iteratively during development, given
that it avoids reanalyzing unmodified translation units, and that users
can restrict the set of translation units under consideration. It should
also be fast enough to run in CI (?).

Consider a small QEMU configuration and build (all commands were run on
the same 12-thread laptop):

    $ cd build && time ../configure --target-list=x86_64-softmmu && cd ..
    [...]

    real    0m17.232s
    user    0m13.261s
    sys     0m3.895s

    $ time make -C build -j $(nproc) all
    [...]

    real    2m39.029s
    user    14m49.370s
    sys     1m57.364s

    $ time make -C build -j $(nproc) check
    [...]

    real    2m46.349s
    user    6m4.718s
    sys     4m15.660s

We can run the static analyzer against all translation units enabled in
this configuration:

    $ time ./static-analyzer.py build
    util/qemu-coroutine.c:122:23: non-coroutine_fn function calls coroutine_fn qemu_coroutine_self()
    io/channel.c:152:17: non-coroutine_fn function calls coroutine_fn qio_channel_yield()
    [...]
    Analyzed 1649 translation units in 520.3 seconds.

    real    8m42.342s
    user    95m51.759s
    sys     0m21.576s

You will need libclang's Python bindings to run this. Try `dnf install
python3-clang` or `apt install python3-clang`.

It takes around 1 to 2 seconds for the analyzer to load the compilation
database, determine which translation units to analyze, etc. The
durations reported by the analyzer itself don't include those steps,
which is why they differ from what `time` reports.

We can also analyze only some of the translation units:

    $ time ./static-analyzer.py build block
    block/raw-format.c:420:12: non-coroutine_fn function calls coroutine_fn bdrv_co_ioctl()
    block/blkverify.c:266:12: non-coroutine_fn function calls coroutine_fn bdrv_co_flush()
    [...]
    Analyzed 21 translation units (58 other were up-to-date) in 5.8 seconds.

    real    0m7.031s
    user    0m40.951s
    sys     0m1.299s

Since the previous command had already analyzed all translation units,
only the ones that had problems were reanalyzed.

Now skipping all the actual checks, but still parsing and building the
AST for each translation unit, and adding --force to reanalyze all
translation units:

    $ time ./static-analyzer.py build --force --skip-checks
    Analyzed 1649 translation units in 41.2 seconds.

    real    0m42.296s
    user    7m14.256s
    sys     0m15.803s

And now running a single check:

    $ time ./static-analyzer.py build --force --check return-value-never-used
    Analyzed 1649 translation units in 157.6 seconds.

    real    2m38.759s
    user    29m28.930s
    sys     0m17.968s

TODO:
  - Run in GitLab CI (?).
  - Finish the "coroutine_fn" check.
  - Add check tests where missing.
  - Avoid redundant AST traversals while keeping checks modular.
  - More optimization.

v2:
  - Fix parsing of compilation database commands.
  - Reorganize checks and split them into separate modules.
  - Make "return-value-never-used" ignore __attribute__((unused)) funcs.
  - Add a visitor() abstraction wrapping clang_visitChildren() that is
    faster than using Cursor.get_children() with recursion.
  - Add support for implementing tests for checks, and add some tests to
    "return-value-never-used".
  - Use dependency information provided by Ninja to skip checks on
    translation units that haven't been modified since they last passed
    those checks.
  - Ignore translation units from git submodules.
  - And more.

Alberto Faria (10):
  Add an extensible static analyzer
  Drop unused static function return values
  static-analyzer: Support adding tests to checks
  static-analyzer: Avoid reanalyzing unmodified translation units
  static-analyzer: Enforce coroutine_fn restrictions for direct calls
  Fix some direct calls from non-coroutine_fn to coroutine_fn
  static-analyzer: Enforce coroutine_fn restrictions on function
    pointers
  Fix some bad coroutine_fn indirect calls and pointer assignments
  block: Add no_coroutine_fn marker
  Fix some calls from coroutine_fn to no_coroutine_fn

 accel/kvm/kvm-all.c                        |  12 +-
 accel/tcg/plugin-gen.c                     |   9 +-
 accel/tcg/translate-all.c                  |   9 +-
 audio/audio.c                              |   5 +-
 block.c                                    |   2 +-
 block/backup.c                             |   2 +-
 block/block-copy.c                         |   4 +-
 block/commit.c                             |   2 +-
 block/dirty-bitmap.c                       |   6 +-
 block/file-posix.c                         |   6 +-
 block/io.c                                 |  52 +-
 block/mirror.c                             |   4 +-
 block/monitor/block-hmp-cmds.c             |   2 +-
 block/nvme.c                               |   3 +-
 block/parallels.c                          |  28 +-
 block/qcow.c                               |  10 +-
 block/qcow2-bitmap.c                       |   6 +-
 block/qcow2-snapshot.c                     |   6 +-
 block/qcow2.c                              |  38 +-
 block/qcow2.h                              |  14 +-
 block/qed-table.c                          |   2 +-
 block/qed.c                                |  14 +-
 block/quorum.c                             |   7 +-
 block/ssh.c                                |   6 +-
 block/throttle-groups.c                    |   3 +-
 block/vdi.c                                |  17 +-
 block/vhdx.c                               |   8 +-
 block/vmdk.c                               |  11 +-
 block/vpc.c                                |   4 +-
 block/vvfat.c                              |  11 +-
 blockdev.c                                 |   2 +-
 chardev/char-ringbuf.c                     |   4 +-
 contrib/ivshmem-server/main.c              |   4 +-
 contrib/vhost-user-blk/vhost-user-blk.c    |   5 +-
 dump/dump.c                                |   4 +-
 fsdev/virtfs-proxy-helper.c                |   3 +-
 gdbstub.c                                  |  18 +-
 hw/audio/intel-hda.c                       |   7 +-
 hw/audio/pcspk.c                           |   7 +-
 hw/char/virtio-serial-bus.c                |  14 +-
 hw/display/cirrus_vga.c                    |   5 +-
 hw/hyperv/vmbus.c                          |  10 +-
 hw/i386/intel_iommu.c                      |  28 +-
 hw/i386/pc_q35.c                           |   5 +-
 hw/ide/pci.c                               |   4 +-
 hw/net/rtl8139.c                           |   3 +-
 hw/net/virtio-net.c                        |   6 +-
 hw/net/vmxnet3.c                           |   3 +-
 hw/nvme/ctrl.c                             |  17 +-
 hw/nvram/fw_cfg.c                          |   3 +-
 hw/scsi/megasas.c                          |   6 +-
 hw/scsi/mptconfig.c                        |   7 +-
 hw/scsi/mptsas.c                           |  14 +-
 hw/scsi/scsi-bus.c                         |   6 +-
 hw/usb/dev-audio.c                         |  13 +-
 hw/usb/hcd-ehci.c                          |   6 +-
 hw/usb/hcd-ohci.c                          |   4 +-
 hw/usb/hcd-xhci.c                          |  56 +-
 hw/vfio/common.c                           |  21 +-
 hw/virtio/vhost-vdpa.c                     |   3 +-
 hw/virtio/vhost.c                          |  11 +-
 hw/virtio/virtio-iommu.c                   |   4 +-
 hw/virtio/virtio-mem.c                     |   9 +-
 include/block/block-common.h               |   2 +-
 include/block/block-hmp-cmds.h             |   2 +-
 include/block/block-io.h                   |   5 +-
 include/block/block_int-common.h           |  12 +-
 include/qemu/coroutine.h                   |  43 +-
 io/channel-command.c                       |  10 +-
 migration/migration.c                      |  12 +-
 net/dump.c                                 |  16 +-
 net/vhost-vdpa.c                           |   8 +-
 qemu-img.c                                 |   6 +-
 qga/commands-posix-ssh.c                   |  10 +-
 softmmu/physmem.c                          |  18 +-
 softmmu/qtest.c                            |   5 +-
 static-analyzer.py                         | 801 +++++++++++++++++++++
 static_analyzer/__init__.py                | 348 +++++++++
 static_analyzer/coroutine_fn.py            | 280 +++++++
 static_analyzer/no_coroutine_fn.py         | 111 +++
 static_analyzer/return_value_never_used.py | 220 ++++++
 subprojects/libvduse/libvduse.c            |  12 +-
 subprojects/libvhost-user/libvhost-user.c  |  24 +-
 target/i386/host-cpu.c                     |   3 +-
 target/i386/kvm/kvm.c                      |  19 +-
 tcg/optimize.c                             |   3 +-
 tests/qtest/libqos/malloc.c                |   5 +-
 tests/qtest/libqos/qgraph.c                |   3 +-
 tests/qtest/test-x86-cpuid-compat.c        |   8 +-
 tests/qtest/virtio-9p-test.c               |   6 +-
 tests/unit/test-aio-multithread.c          |   5 +-
 tests/vhost-user-bridge.c                  |  19 +-
 ui/vnc.c                                   |  23 +-
 util/aio-posix.c                           |   7 +-
 util/uri.c                                 |  18 +-
 95 files changed, 2160 insertions(+), 519 deletions(-)
 create mode 100755 static-analyzer.py
 create mode 100644 static_analyzer/__init__.py
 create mode 100644 static_analyzer/coroutine_fn.py
 create mode 100644 static_analyzer/no_coroutine_fn.py
 create mode 100644 static_analyzer/return_value_never_used.py

-- 
2.37.1
Re: [RFC v2 00/10] Introduce an extensible static analyzer
Posted by Paolo Bonzini 1 year, 6 months ago
On Fri, Jul 29, 2022 at 3:01 PM Alberto Faria <afaria@redhat.com> wrote:
> Performance isn't great, but with some more optimization, the analyzer
> should be fast enough to be used iteratively during development, given
> that it avoids reanalyzing unmodified translation units, and that users
> can restrict the set of translation units under consideration. It should
> also be fast enough to run in CI (?).

I took a look again today, and the results are indeed very nice (I
sent a patch series with the code changes from this one).

The performance is not great as you point out. :/  I made a couple
attempts at optimizing it, for example the "actual_visitor" can be
written in a more efficient way like this, to avoid the stack:

    @CFUNCTYPE(c_int, Cursor, Cursor, py_object)
    def actual_visitor(node: Cursor, parent: Cursor, client_data:
Cursor) -> int:

        try:
            node.parent = client_data

            # several clang.cindex methods need Cursor._tu to be set
            node._tu = client_data._tu
            r = visitor(node)
            if r is VisitorResult.RECURSE:
                return 0 \
                    if conf.lib.clang_visitChildren(node,
actual_visitor, node) != 0 \
                    else 1
            else:
                return r.value

        except BaseException as e:
            # Exceptions can't cross into C. Stash it, abort the visitation, and
            # reraise it.
            if exception is None:
                exception = e

            return VisitorResult.BREAK.value

    root.parent = None
    result = conf.lib.clang_visitChildren(root, actual_visitor, root)

    if exception is not None:
        raise exception

    return result == 0

However, it seems like a lost battle. :( Some of the optimizations are
stuff that you should just not have to do, for example only invoking
"x.kind" once (because it's a property not a field). Another issue is
that the bindings are incomplete, for example if you have a ForStmt
you just get a Cursor and you are not able to access individual
expressions. As a result, this for example is wrong in the
return-value-never-used test:

                static int f(void) { return 42; }
                static void g(void) {
                    for (f(); ; ) { } /* should warn, it doesn't */
                }

and I couldn't fix it without breaking "for (; f(); )" because AFAICT
the two are indistinguishable.

On top of this, using libclang directly should make it possible to use
the Matcher API (the same one used by clang-match), instead of writing
everything by hand. It may not be that useful though in practice, but
it's a possibility.

Paolo
Re: [RFC v2 00/10] Introduce an extensible static analyzer
Posted by Christian Schoenebeck 1 year, 6 months ago
On Freitag, 14. Oktober 2022 00:00:58 CEST Paolo Bonzini wrote:
[...]
> However, it seems like a lost battle. :( Some of the optimizations are
> stuff that you should just not have to do, for example only invoking
> "x.kind" once (because it's a property not a field). Another issue is
> that the bindings are incomplete, for example if you have a ForStmt
> you just get a Cursor and you are not able to access individual
> expressions. As a result, this for example is wrong in the
> return-value-never-used test:
> 
>                 static int f(void) { return 42; }
>                 static void g(void) {
>                     for (f(); ; ) { } /* should warn, it doesn't */
>                 }
> 
> and I couldn't fix it without breaking "for (; f(); )" because AFAICT
> the two are indistinguishable.

You mean accessing the `for` loop's init expression, condition expression,
increment statement? AFAICS those are accessible already by calling
get_children() on the `for` statement cursor:
https://github.com/llvm/llvm-project/blob/main/clang/bindings/python/clang/cindex.py#L1833

I just made a quick test with a short .c file and their demo script:
https://github.com/llvm/llvm-project/blob/main/clang/bindings/python/examples/cindex/cindex-dump.py

And I got all those components of the `for` loop as children of the `for`
cursor.

Best regards,
Christian Schoenebeck
Re: [RFC v2 00/10] Introduce an extensible static analyzer
Posted by Marc-André Lureau 1 year, 8 months ago
Hi

On Fri, Jul 29, 2022 at 5:01 PM Alberto Faria <afaria@redhat.com> wrote:
>
> This series introduces a static analyzer for QEMU. It consists of a
> single static-analyzer.py script that relies on libclang's Python
> bindings, and provides a common framework on which arbitrary static
> analysis checks can be developed and run against QEMU's code base.
>
> Summary of the series:
>
>   - Patch 1 adds the base static analyzer, along with a simple check
>     that finds static functions whose return value is never used, and
>     patch 2 fixes many occurrences of this.
>
>   - Patch 3 introduces support for output-comparison check tests, and
>     adds some tests to the abovementioned check.
>
>   - Patch 4 makes the analyzer skip checks on a translation unit when it
>     hasn't been modified since the last time those checks passed.
>
>   - Patch 5 adds a check to ensure that non-coroutine_fn functions don't
>     perform direct calls to coroutine_fn functions, and patch 6 fixes
>     some violations of this rule.
>
>   - Patch 7 adds a check to ensure that operations on coroutine_fn
>     pointers make sense, like assignment and indirect calls, and patch 8
>     fixes some problems detected by the check. (Implementing this check
>     properly is complicated, since AFAICT annotation attributes cannot
>     be applied directly to types. This part still needs a lot of work.)
>
>   - Patch 9 introduces a no_coroutine_fn marker for functions that
>     should not be called from coroutines, makes generated_co_wrapper
>     evaluate to no_coroutine_fn, and adds a check enforcing this rule.
>     Patch 10 fixes some violations that it finds.
>
> The current primary motivation for this work is enforcing rules around
> block layer coroutines, which is why most of the series focuses on that.
> However, the static analyzer is intended to be sufficiently generic to
> satisfy other present and future QEMU static analysis needs.
>
> Performance isn't great, but with some more optimization, the analyzer
> should be fast enough to be used iteratively during development, given
> that it avoids reanalyzing unmodified translation units, and that users
> can restrict the set of translation units under consideration. It should
> also be fast enough to run in CI (?).
>
> Consider a small QEMU configuration and build (all commands were run on
> the same 12-thread laptop):
>
>     $ cd build && time ../configure --target-list=x86_64-softmmu && cd ..
>     [...]
>
>     real    0m17.232s
>     user    0m13.261s
>     sys     0m3.895s
>
>     $ time make -C build -j $(nproc) all
>     [...]
>
>     real    2m39.029s
>     user    14m49.370s
>     sys     1m57.364s
>
>     $ time make -C build -j $(nproc) check
>     [...]
>
>     real    2m46.349s
>     user    6m4.718s
>     sys     4m15.660s
>
> We can run the static analyzer against all translation units enabled in
> this configuration:
>
>     $ time ./static-analyzer.py build
>     util/qemu-coroutine.c:122:23: non-coroutine_fn function calls coroutine_fn qemu_coroutine_self()
>     io/channel.c:152:17: non-coroutine_fn function calls coroutine_fn qio_channel_yield()
>     [...]
>     Analyzed 1649 translation units in 520.3 seconds.
>
>     real    8m42.342s
>     user    95m51.759s
>     sys     0m21.576s
>
> You will need libclang's Python bindings to run this. Try `dnf install
> python3-clang` or `apt install python3-clang`.
>
> It takes around 1 to 2 seconds for the analyzer to load the compilation
> database, determine which translation units to analyze, etc. The
> durations reported by the analyzer itself don't include those steps,
> which is why they differ from what `time` reports.
>
> We can also analyze only some of the translation units:
>
>     $ time ./static-analyzer.py build block
>     block/raw-format.c:420:12: non-coroutine_fn function calls coroutine_fn bdrv_co_ioctl()
>     block/blkverify.c:266:12: non-coroutine_fn function calls coroutine_fn bdrv_co_flush()
>     [...]
>     Analyzed 21 translation units (58 other were up-to-date) in 5.8 seconds.
>
>     real    0m7.031s
>     user    0m40.951s
>     sys     0m1.299s
>
> Since the previous command had already analyzed all translation units,
> only the ones that had problems were reanalyzed.
>
> Now skipping all the actual checks, but still parsing and building the
> AST for each translation unit, and adding --force to reanalyze all
> translation units:
>
>     $ time ./static-analyzer.py build --force --skip-checks
>     Analyzed 1649 translation units in 41.2 seconds.
>
>     real    0m42.296s
>     user    7m14.256s
>     sys     0m15.803s
>
> And now running a single check:
>
>     $ time ./static-analyzer.py build --force --check return-value-never-used
>     Analyzed 1649 translation units in 157.6 seconds.
>
>     real    2m38.759s
>     user    29m28.930s
>     sys     0m17.968s
>
> TODO:
>   - Run in GitLab CI (?).
>   - Finish the "coroutine_fn" check.
>   - Add check tests where missing.
>   - Avoid redundant AST traversals while keeping checks modular.
>   - More optimization.

Great work so far! This seems easier to hack than my attempt to use
clang-tidy to write some qemu checks
(https://github.com/elmarco/clang-tools-extra)

The code seems quite generic, I wonder if such a tool in python wasn't
already developed (I couldn't find it easily searching on github).

Why not make it standalone from qemu? Similar to
https://gitlab.com/qemu-project/python-qemu-qmp, you could have your
own release management, issue tracker, code formatting, license, CI
etc. (you should add copyright header in each file, at least that's
pretty much required in qemu nowadays). You could also have the
qemu-specific checks there imho (clang-tidy has google & llvm specific
checks too)

It would be nice to write some docs, in docs/devel/testing.rst and
some new meson/ninja/make targets to run the checks directly from a
build tree.

On fc36, I had several dependencies I needed to install manually (imho
they should have been pulled by python3-clang), but more annoyingly I
got:
clang.cindex.LibclangError: libclang.so: cannot open shared object
file: No such file or directory. To provide a path to libclang use
Config.set_library_path() or Config.set_library_file().

clang-libs doesn't install libclang.so, I wonder why. I made a link
manually and it works, but it's probably incorrect. I'll try to open
issues for the clang packaging.

cheers


>
> v2:
>   - Fix parsing of compilation database commands.
>   - Reorganize checks and split them into separate modules.
>   - Make "return-value-never-used" ignore __attribute__((unused)) funcs.
>   - Add a visitor() abstraction wrapping clang_visitChildren() that is
>     faster than using Cursor.get_children() with recursion.
>   - Add support for implementing tests for checks, and add some tests to
>     "return-value-never-used".
>   - Use dependency information provided by Ninja to skip checks on
>     translation units that haven't been modified since they last passed
>     those checks.
>   - Ignore translation units from git submodules.
>   - And more.
>
> Alberto Faria (10):
>   Add an extensible static analyzer
>   Drop unused static function return values
>   static-analyzer: Support adding tests to checks
>   static-analyzer: Avoid reanalyzing unmodified translation units
>   static-analyzer: Enforce coroutine_fn restrictions for direct calls
>   Fix some direct calls from non-coroutine_fn to coroutine_fn
>   static-analyzer: Enforce coroutine_fn restrictions on function
>     pointers
>   Fix some bad coroutine_fn indirect calls and pointer assignments
>   block: Add no_coroutine_fn marker
>   Fix some calls from coroutine_fn to no_coroutine_fn
>
>  accel/kvm/kvm-all.c                        |  12 +-
>  accel/tcg/plugin-gen.c                     |   9 +-
>  accel/tcg/translate-all.c                  |   9 +-
>  audio/audio.c                              |   5 +-
>  block.c                                    |   2 +-
>  block/backup.c                             |   2 +-
>  block/block-copy.c                         |   4 +-
>  block/commit.c                             |   2 +-
>  block/dirty-bitmap.c                       |   6 +-
>  block/file-posix.c                         |   6 +-
>  block/io.c                                 |  52 +-
>  block/mirror.c                             |   4 +-
>  block/monitor/block-hmp-cmds.c             |   2 +-
>  block/nvme.c                               |   3 +-
>  block/parallels.c                          |  28 +-
>  block/qcow.c                               |  10 +-
>  block/qcow2-bitmap.c                       |   6 +-
>  block/qcow2-snapshot.c                     |   6 +-
>  block/qcow2.c                              |  38 +-
>  block/qcow2.h                              |  14 +-
>  block/qed-table.c                          |   2 +-
>  block/qed.c                                |  14 +-
>  block/quorum.c                             |   7 +-
>  block/ssh.c                                |   6 +-
>  block/throttle-groups.c                    |   3 +-
>  block/vdi.c                                |  17 +-
>  block/vhdx.c                               |   8 +-
>  block/vmdk.c                               |  11 +-
>  block/vpc.c                                |   4 +-
>  block/vvfat.c                              |  11 +-
>  blockdev.c                                 |   2 +-
>  chardev/char-ringbuf.c                     |   4 +-
>  contrib/ivshmem-server/main.c              |   4 +-
>  contrib/vhost-user-blk/vhost-user-blk.c    |   5 +-
>  dump/dump.c                                |   4 +-
>  fsdev/virtfs-proxy-helper.c                |   3 +-
>  gdbstub.c                                  |  18 +-
>  hw/audio/intel-hda.c                       |   7 +-
>  hw/audio/pcspk.c                           |   7 +-
>  hw/char/virtio-serial-bus.c                |  14 +-
>  hw/display/cirrus_vga.c                    |   5 +-
>  hw/hyperv/vmbus.c                          |  10 +-
>  hw/i386/intel_iommu.c                      |  28 +-
>  hw/i386/pc_q35.c                           |   5 +-
>  hw/ide/pci.c                               |   4 +-
>  hw/net/rtl8139.c                           |   3 +-
>  hw/net/virtio-net.c                        |   6 +-
>  hw/net/vmxnet3.c                           |   3 +-
>  hw/nvme/ctrl.c                             |  17 +-
>  hw/nvram/fw_cfg.c                          |   3 +-
>  hw/scsi/megasas.c                          |   6 +-
>  hw/scsi/mptconfig.c                        |   7 +-
>  hw/scsi/mptsas.c                           |  14 +-
>  hw/scsi/scsi-bus.c                         |   6 +-
>  hw/usb/dev-audio.c                         |  13 +-
>  hw/usb/hcd-ehci.c                          |   6 +-
>  hw/usb/hcd-ohci.c                          |   4 +-
>  hw/usb/hcd-xhci.c                          |  56 +-
>  hw/vfio/common.c                           |  21 +-
>  hw/virtio/vhost-vdpa.c                     |   3 +-
>  hw/virtio/vhost.c                          |  11 +-
>  hw/virtio/virtio-iommu.c                   |   4 +-
>  hw/virtio/virtio-mem.c                     |   9 +-
>  include/block/block-common.h               |   2 +-
>  include/block/block-hmp-cmds.h             |   2 +-
>  include/block/block-io.h                   |   5 +-
>  include/block/block_int-common.h           |  12 +-
>  include/qemu/coroutine.h                   |  43 +-
>  io/channel-command.c                       |  10 +-
>  migration/migration.c                      |  12 +-
>  net/dump.c                                 |  16 +-
>  net/vhost-vdpa.c                           |   8 +-
>  qemu-img.c                                 |   6 +-
>  qga/commands-posix-ssh.c                   |  10 +-
>  softmmu/physmem.c                          |  18 +-
>  softmmu/qtest.c                            |   5 +-
>  static-analyzer.py                         | 801 +++++++++++++++++++++
>  static_analyzer/__init__.py                | 348 +++++++++
>  static_analyzer/coroutine_fn.py            | 280 +++++++
>  static_analyzer/no_coroutine_fn.py         | 111 +++
>  static_analyzer/return_value_never_used.py | 220 ++++++
>  subprojects/libvduse/libvduse.c            |  12 +-
>  subprojects/libvhost-user/libvhost-user.c  |  24 +-
>  target/i386/host-cpu.c                     |   3 +-
>  target/i386/kvm/kvm.c                      |  19 +-
>  tcg/optimize.c                             |   3 +-
>  tests/qtest/libqos/malloc.c                |   5 +-
>  tests/qtest/libqos/qgraph.c                |   3 +-
>  tests/qtest/test-x86-cpuid-compat.c        |   8 +-
>  tests/qtest/virtio-9p-test.c               |   6 +-
>  tests/unit/test-aio-multithread.c          |   5 +-
>  tests/vhost-user-bridge.c                  |  19 +-
>  ui/vnc.c                                   |  23 +-
>  util/aio-posix.c                           |   7 +-
>  util/uri.c                                 |  18 +-
>  95 files changed, 2160 insertions(+), 519 deletions(-)
>  create mode 100755 static-analyzer.py
>  create mode 100644 static_analyzer/__init__.py
>  create mode 100644 static_analyzer/coroutine_fn.py
>  create mode 100644 static_analyzer/no_coroutine_fn.py
>  create mode 100644 static_analyzer/return_value_never_used.py
>
> --
> 2.37.1
>
Re: [RFC v2 00/10] Introduce an extensible static analyzer
Posted by Alberto Faria 1 year, 8 months ago
On Thu, Aug 4, 2022 at 12:44 PM Marc-André Lureau
<marcandre.lureau@redhat.com> wrote:
> Hi
>
> Great work so far! This seems easier to hack than my attempt to use
> clang-tidy to write some qemu checks
> (https://github.com/elmarco/clang-tools-extra)
>
> The code seems quite generic, I wonder if such a tool in python wasn't
> already developed (I couldn't find it easily searching on github).
>
> Why not make it standalone from qemu? Similar to
> https://gitlab.com/qemu-project/python-qemu-qmp, you could have your
> own release management, issue tracker, code formatting, license, CI
> etc. (you should add copyright header in each file, at least that's
> pretty much required in qemu nowadays). You could also have the
> qemu-specific checks there imho (clang-tidy has google & llvm specific
> checks too)

This is an interesting idea. Indeed, the analyzer is essentially a
more easily extensible, Python version of clang-tidy (without the big
built-in library of checks).

I think I'll continue working on it embedded in QEMU for now, mostly
because it depends on some aspects of the build system, and gradually
generalize it to a point where it could be made into a standalone
thing.

> It would be nice to write some docs, in docs/devel/testing.rst and
> some new meson/ninja/make targets to run the checks directly from a
> build tree.

Sounds good, I'll work on it.

> On fc36, I had several dependencies I needed to install manually (imho
> they should have been pulled by python3-clang), but more annoyingly I
> got:
> clang.cindex.LibclangError: libclang.so: cannot open shared object
> file: No such file or directory. To provide a path to libclang use
> Config.set_library_path() or Config.set_library_file().
>
> clang-libs doesn't install libclang.so, I wonder why. I made a link
> manually and it works, but it's probably incorrect. I'll try to open
> issues for the clang packaging.

That's strange. Thanks for looking into this.

Alberto
Re: [RFC v2 00/10] Introduce an extensible static analyzer
Posted by Marc-André Lureau 1 year, 8 months ago
Hi

On Fri, Aug 12, 2022 at 7:49 PM Alberto Faria <afaria@redhat.com> wrote:
>
> On Thu, Aug 4, 2022 at 12:44 PM Marc-André Lureau
> <marcandre.lureau@redhat.com> wrote:
> > On fc36, I had several dependencies I needed to install manually (imho
> > they should have been pulled by python3-clang), but more annoyingly I
> > got:
> > clang.cindex.LibclangError: libclang.so: cannot open shared object
> > file: No such file or directory. To provide a path to libclang use
> > Config.set_library_path() or Config.set_library_file().
> >
> > clang-libs doesn't install libclang.so, I wonder why. I made a link
> > manually and it works, but it's probably incorrect. I'll try to open
> > issues for the clang packaging.
>
> That's strange. Thanks for looking into this.

No that's normal, I just got confused. clang-devel provides it.

However, I opened https://bugzilla.redhat.com/show_bug.cgi?id=2115362
"python3-clang depends on libclang.so"