[libvirt PATCH 00/10] [RFC] clang-tidy CI integration

Tim Wiederhake posted 10 patches 3 years, 1 month ago
Test syntax-check failed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/libvirt tags/patchew/20210212132534.29066-1-twiederh@redhat.com
.gitlab-ci.yml            |  88 ++++++
scripts/run-clang-tidy.py | 557 ++++++++++++++++++++++++++++++++++++++
2 files changed, 645 insertions(+)
create mode 100755 scripts/run-clang-tidy.py
[libvirt PATCH 00/10] [RFC] clang-tidy CI integration
Posted by Tim Wiederhake 3 years, 1 month ago
"clang-tidy" is a static code analysis tool for c and c++ code. See
https://listman.redhat.com/archives/libvir-list/2021-January/msg01152.html
for some issues found by clang-tidy and more background information.

Meson has support for clang-tidy and generates target named "clang-tidy" if
it finds a ".clang-tidy" configuration file in the project's root directory.

There are some problems with this approach though, with regards to inclusion
in GitLab's CI:

* Single-threaded runtime of a complete scan takes between 95 and 110 minutes,
depending on the enabled checks, which is significantly longer than GitLab's
maximum run time of 60 minutes, after which jobs are aborted.

* Even without this limit on runtime, this new check would double to triple
the run time of the libVirt pipeline in GitLab.

* clang-tidy finds a lot of false positives (see link above for explanation)
and has checks that contradict the libVirt code style (e.g. braces around
blocks). This makes a quite complex configuration in ".clang-tidy" neccessary.

* I was unable to make clang-tidy recognize the settings from the
configuration file for generated files, leading clang-tidy to always add some
checks. These checks were among those that produced false positives.

* The list of enabled / disabled checks in the yaml configuration file is a
quite long string, making it hard to weave in some comments / documentation
on which checks are enabled / disabled for what reason.

This series introduces a new script, "run-clang-tidy.py". This is a
replacement for the script of the same name from clang-tools-extra. It offers
parallel execution, caching of results and a configurable soft-timeout.

Please see the individual commits for more details. Comments welcome.

https://gitlab.com/twiederh/libvirt/-/pipelines/255321968 → "clang-tidy".

Tim

Tim Wiederhake (10):
  clang-tidy: Add a simple runner
  clang-tidy: Run in parallel
  clang-tidy: Filter output
  clang-tidy: Add cache
  clang-tidy: Add timeout
  clang-tidy: Allow timeouts
  clang-tidy: Add shuffle
  clang-tidy: Make list of checks explicit
  clang-tidy: Disable irrelevant and failing checks
  clang-tidy: Add CI integration

 .gitlab-ci.yml            |  88 ++++++
 scripts/run-clang-tidy.py | 557 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 645 insertions(+)
 create mode 100755 scripts/run-clang-tidy.py

-- 
2.26.2


Re: [libvirt PATCH 00/10] [RFC] clang-tidy CI integration
Posted by Daniel P. Berrangé 3 years, 1 month ago
On Fri, Feb 12, 2021 at 02:25:24PM +0100, Tim Wiederhake wrote:
> "clang-tidy" is a static code analysis tool for c and c++ code. See
> https://listman.redhat.com/archives/libvir-list/2021-January/msg01152.html
> for some issues found by clang-tidy and more background information.
> 
> Meson has support for clang-tidy and generates target named "clang-tidy" if
> it finds a ".clang-tidy" configuration file in the project's root directory.
> 
> There are some problems with this approach though, with regards to inclusion
> in GitLab's CI:
> 
> * Single-threaded runtime of a complete scan takes between 95 and 110 minutes,
> depending on the enabled checks, which is significantly longer than GitLab's
> maximum run time of 60 minutes, after which jobs are aborted.

IIUC, you can override the timeout setting per job...

> 
> * Even without this limit on runtime, this new check would double to triple
> the run time of the libVirt pipeline in GitLab.

Yeah that's a problem, but bear in mind there are two separate scenarios

Running post merge we need to check all files, but the length of time
is a non-issue since no one is waiting for a post-merge check to
complete.

Running pre- merge we only need to check files which are modified by
the patches on the current branch. That ought to be orders of magnitude
faster.

> 
> * clang-tidy finds a lot of false positives (see link above for explanation)
> and has checks that contradict the libVirt code style (e.g. braces around
> blocks). This makes a quite complex configuration in ".clang-tidy" neccessary.

IMHO for code checkers to add real value to people you need the existing
codebase in git to be 100% clean.

If you're not going to make the existing code compliant, then users will
never be sure of what they should do for new code. We see this all the
time in QEMU where its patch checker complains about problems that were
pre-existing.


If we could get clang-tidy to do a subset of checks where we can malke
libvirt 100% clean
 
> * I was unable to make clang-tidy recognize the settings from the
> configuration file for generated files, leading clang-tidy to always add some
> checks. These checks were among those that produced false positives.
> 
> * The list of enabled / disabled checks in the yaml configuration file is a
> quite long string, making it hard to weave in some comments / documentation
> on which checks are enabled / disabled for what reason.
> 
> This series introduces a new script, "run-clang-tidy.py". This is a
> replacement for the script of the same name from clang-tools-extra. It offers
> parallel execution, caching of results and a configurable soft-timeout.
> 
> Please see the individual commits for more details. Comments welcome.
> 
> https://gitlab.com/twiederh/libvirt/-/pipelines/255321968 → "clang-tidy".

What am I missing here - that job just seems to show a list of
filenames, but isn't reporting any issues for them ?


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

Re: [libvirt PATCH 00/10] [RFC] clang-tidy CI integration
Posted by Tim Wiederhake 3 years, 1 month ago
On Fri, 2021-02-12 at 14:12 +0000, Daniel P. Berrangé wrote:
> On Fri, Feb 12, 2021 at 02:25:24PM +0100, Tim Wiederhake wrote:
> > "clang-tidy" is a static code analysis tool for c and c++ code. See
> > https://listman.redhat.com/archives/libvir-list/2021-January/msg01152.html
> > for some issues found by clang-tidy and more background
> > information.
> > 
> > Meson has support for clang-tidy and generates target named "clang-
> > tidy" if
> > it finds a ".clang-tidy" configuration file in the project's root
> > directory.
> > 
> > There are some problems with this approach though, with regards to
> > inclusion
> > in GitLab's CI:
> > 
> > * Single-threaded runtime of a complete scan takes between 95 and
> > 110 minutes,
> > depending on the enabled checks, which is significantly longer than
> > GitLab's
> > maximum run time of 60 minutes, after which jobs are aborted.
> 
> IIUC, you can override the timeout setting per job...

IMO, extending the "hard" timeout for the job is not an option anyway,
for the reason presented in the next point:
> 
> > * Even without this limit on runtime, this new check would double
> > to triple
> > the run time of the libVirt pipeline in GitLab.
> 

↑ This one.

> Yeah that's a problem, but bear in mind there are two separate
> scenarios
> 
> Running post merge we need to check all files, but the length of time
> is a non-issue since no one is waiting for a post-merge check to
> complete.

I am vary of the possibility of the pre-merge job running successfully
(e.g. because it ran into timeout before finding a certain issue) and
then failing for the post-merge job. One idea would be to set the post-
merge job to "allow-failure: true", i.e. making it a "notification"
only. But I am absolutely unsure on what the best approach is here,
hence the "RFC".

> Running pre- merge we only need to check files which are modified by
> the patches on the current branch. That ought to be orders of
> magnitude
> faster.

I see where you are coming from, but I believe this does not catch
cases where e.g. a change in a header file leads to a clang-tidy issue
in an otherwise untouched .c file, or vice versa.

The approach to cache files keyed on the result of the preprocessor
stage ("hash(gcc -E file.c)") on the other hand does catch this kind of
changes and reruns clang-tidy for all affected files.

> > * clang-tidy finds a lot of false positives (see link above for
> > explanation)
> > and has checks that contradict the libVirt code style (e.g. braces
> > around
> > blocks). This makes a quite complex configuration in ".clang-tidy"
> > necessary.
> 
> IMHO for code checkers to add real value to people you need the
> existing
> codebase in git to be 100% clean.
> 
> If you're not going to make the existing code compliant, then users
> will
> never be sure of what they should do for new code. We see this all
> the
> time in QEMU where its patch checker complains about problems that
> were
> pre-existing.
>
> If we could get clang-tidy to do a subset of checks where we can make
> libvirt 100% clean

I believe there is a misunderstanding here, please see explanation
below.

> > * I was unable to make clang-tidy recognize the settings from the
> > configuration file for generated files, leading clang-tidy to
> > always add some
> > checks. These checks were among those that produced false
> > positives.
> > 
> > * The list of enabled / disabled checks in the yaml configuration
> > file is a
> > quite long string, making it hard to weave in some comments /
> > documentation
> > on which checks are enabled / disabled for what reason.
> > 
> > This series introduces a new script, "run-clang-tidy.py". This is a
> > replacement for the script of the same name from clang-tools-extra. 
> > It offers
> > parallel execution, caching of results and a configurable soft-
> > timeout.
> > 
> > Please see the individual commits for more details. Comments
> > welcome.
> > 
> > https://gitlab.com/twiederh/libvirt/-/pipelines/255321968 → "clang-
> > tidy".
> 
> What am I missing here - that job just seems to show a list of
> filenames, but isn't reporting any issues for them ?

There are three "types" of checks:
* Irrelevant checks that we want to disable permanently, e.g. checks
for some c++ specific issue that does not apply to c code,
* Checks that the libvirt code base currently passes,
* Checks that the libvirt code base currently fails.

Path #09 disables the first and last category of checks. Note that the
list is not defined in terms of "enable these checks" but "disable
these checks". The idea is to have checks that are introduced in new
versions of clang-tidy enabled by default and disabled checks commented
on and justified.

This is why you see no findings in the output -- there are none,
currently only "passing" checks are enabled. I disabled all failing
checks indiscriminately, as a discussion of which particular checks are
actually useful and which are not would distract from the general
question of "Do we want clang-tidy integration in the CI and if so,
how".

See for example the check "readability-braces-around-statements" [1],
which mandates braces around single-line "if" or "while" blocks, which
contradicts the libvirt code style. I believe that the verdict for this
check will be permanent disablement. Other checks, for example "cert-
err34-c" [2] are issues that in my opinion should be fixed and I
started doing so already, see [3].


For comparison, have a look at [4]: This is the output of run-clang-
tidy.py with only the "irrelevant" checks disabled.

> 
> Regards,
> Daniel

Regards,
Tim

[1] 
https://clang.llvm.org/extra/clang-tidy/checks/readability-braces-around-statements.html
[2] https://clang.llvm.org/extra/clang-tidy/checks/cert-err34-c.html
[3] 
https://listman.redhat.com/archives/libvir-list/2021-January/msg01152.html
[4] https://gitlab.com/twiederh/libvirt/-/jobs/1026797599

Re: [libvirt PATCH 00/10] [RFC] clang-tidy CI integration
Posted by Ján Tomko 3 years, 1 month ago
On a Friday in 2021, Tim Wiederhake wrote:
>"clang-tidy" is a static code analysis tool for c and c++ code. See
>https://listman.redhat.com/archives/libvir-list/2021-January/msg01152.html
>for some issues found by clang-tidy and more background information.
>
>Meson has support for clang-tidy and generates target named "clang-tidy" if
>it finds a ".clang-tidy" configuration file in the project's root directory.
>
>There are some problems with this approach though, with regards to inclusion
>in GitLab's CI:
>
>* Single-threaded runtime of a complete scan takes between 95 and 110 minutes,
>depending on the enabled checks, which is significantly longer than GitLab's
>maximum run time of 60 minutes, after which jobs are aborted.
>

That does seem like a deal-breaker. Did you manage to do a successful
run by iteratively building up the cache over multiple jobs? Or by doing
a multi-threaded run?

>* Even without this limit on runtime, this new check would double to triple
>the run time of the libVirt pipeline in GitLab.
>

Running it for every job sounds wasteful - it can be run on schedule
like the coverity job - daily or even weekly.

Also, please never write libvirt like that again.

Jano

>* clang-tidy finds a lot of false positives (see link above for explanation)
>and has checks that contradict the libVirt code style (e.g. braces around
>blocks). This makes a quite complex configuration in ".clang-tidy" neccessary.
>
>* I was unable to make clang-tidy recognize the settings from the
>configuration file for generated files, leading clang-tidy to always add some
>checks. These checks were among those that produced false positives.
>
>* The list of enabled / disabled checks in the yaml configuration file is a
>quite long string, making it hard to weave in some comments / documentation
>on which checks are enabled / disabled for what reason.
>
>This series introduces a new script, "run-clang-tidy.py". This is a
>replacement for the script of the same name from clang-tools-extra. It offers
>parallel execution, caching of results and a configurable soft-timeout.
>
>Please see the individual commits for more details. Comments welcome.
>
>https://gitlab.com/twiederh/libvirt/-/pipelines/255321968 → "clang-tidy".
>
>Tim
>
>Tim Wiederhake (10):
>  clang-tidy: Add a simple runner
>  clang-tidy: Run in parallel
>  clang-tidy: Filter output
>  clang-tidy: Add cache
>  clang-tidy: Add timeout
>  clang-tidy: Allow timeouts
>  clang-tidy: Add shuffle
>  clang-tidy: Make list of checks explicit
>  clang-tidy: Disable irrelevant and failing checks
>  clang-tidy: Add CI integration
>
> .gitlab-ci.yml            |  88 ++++++
> scripts/run-clang-tidy.py | 557 ++++++++++++++++++++++++++++++++++++++
> 2 files changed, 645 insertions(+)
> create mode 100755 scripts/run-clang-tidy.py
>
>-- 
>2.26.2
>
>
Re: [libvirt PATCH 00/10] [RFC] clang-tidy CI integration
Posted by Tim Wiederhake 3 years, 1 month ago
On Fri, 2021-02-12 at 14:48 +0100, Ján Tomko wrote:
> On a Friday in 2021, Tim Wiederhake wrote:
> > "clang-tidy" is a static code analysis tool for c and c++ code. See
> > https://listman.redhat.com/archives/libvir-list/2021-January/msg01152.html
> > for some issues found by clang-tidy and more background
> > information.
> > 
> > Meson has support for clang-tidy and generates target named "clang-
> > tidy" if
> > it finds a ".clang-tidy" configuration file in the project's root
> > directory.
> > 
> > There are some problems with this approach though, with regards to
> > inclusion
> > in GitLab's CI:
> > 
> > * Single-threaded runtime of a complete scan takes between 95 and
> > 110 minutes,
> > depending on the enabled checks, which is significantly longer than
> > GitLab's
> > maximum run time of 60 minutes, after which jobs are aborted.
> > 
> 
> That does seem like a deal-breaker. Did you manage to do a successful
> run by iteratively building up the cache over multiple jobs? Or by
> doing
> a multi-threaded run?
> 

I combined the soft-timeout (./scripts/run-clang-tidy.py --timeout ...)
together with caching. What this does is to stop scanning after the
specified amount of time, but will keep the cache.

Consecutive runs will pick up the cache and hopefully get further down
the list of scanned files before running out of time. This process
converges at a point where there are so few (if any) uncached results
left, that the scan completes.

The benefit of this approach is, that it actually works, takes no
longer than 30 minutes and does not require one to click "re-run"
several times until the job becomes green. And in the best case where
no code file changed aka. all cache files are recent, this reduces the
run time of the clang-tidy job to roughly three minutes; see link to
pipeline below. The first iteration did run into timeout, the second
one finished within the time window. I clicked "rerun" a second time to
get the minimum required time I quoted above.

The downside of this approach is, that for "incomplete" runs, aka when
we hit the timeout, not all files are scanned and the code may still
contain issues that would be flagged in a consecutive run.

> > * Even without this limit on runtime, this new check would double
> > to triple
> > the run time of the libVirt pipeline in GitLab.
> > 
> 
> Running it for every job sounds wasteful - it can be run on schedule
> like the coverity job - daily or even weekly.
> 
> Also, please never write libvirt like that again.
> 
> Jano
> 

I have no idea what I mixed up libvirt with, which is capitalized this
way. Good thing it's Friday.

Tim

> > * clang-tidy finds a lot of false positives (see link above for
> > explanation)
> > and has checks that contradict the libVirt code style (e.g. braces
> > around
> > blocks). This makes a quite complex configuration in ".clang-tidy"
> > neccessary.
> > 
> > * I was unable to make clang-tidy recognize the settings from the
> > configuration file for generated files, leading clang-tidy to
> > always add some
> > checks. These checks were among those that produced false
> > positives.
> > 
> > * The list of enabled / disabled checks in the yaml configuration
> > file is a
> > quite long string, making it hard to weave in some comments /
> > documentation
> > on which checks are enabled / disabled for what reason.
> > 
> > This series introduces a new script, "run-clang-tidy.py". This is a
> > replacement for the script of the same name from clang-tools-extra. 
> > It offers
> > parallel execution, caching of results and a configurable soft-
> > timeout.
> > 
> > Please see the individual commits for more details. Comments
> > welcome.
> > 
> > https://gitlab.com/twiederh/libvirt/-/pipelines/255321968 → "clang-
> > tidy".
> > 
> > Tim
> > 
> > Tim Wiederhake (10):
> >  clang-tidy: Add a simple runner
> >  clang-tidy: Run in parallel
> >  clang-tidy: Filter output
> >  clang-tidy: Add cache
> >  clang-tidy: Add timeout
> >  clang-tidy: Allow timeouts
> >  clang-tidy: Add shuffle
> >  clang-tidy: Make list of checks explicit
> >  clang-tidy: Disable irrelevant and failing checks
> >  clang-tidy: Add CI integration
> > 
> > .gitlab-ci.yml            |  88 ++++++
> > scripts/run-clang-tidy.py | 557
> > ++++++++++++++++++++++++++++++++++++++
> > 2 files changed, 645 insertions(+)
> > create mode 100755 scripts/run-clang-tidy.py
> > 
> > -- 
> > 2.26.2
> > 
> >