[squid-dev] RFC: Modernizing sources using clang-tidy

Alex Rousskov rousskov at measurement-factory.com
Mon Apr 20 02:02:35 UTC 2020


Hello,

    Squid sources contain a lot of poorly written, obsolete, and
inconsistent code that (objectively) complicates development and
(unfortunately) increases tensions among developers during review.

Some of those problems can be solved using tools that modify sources.
Clang-tidy is one such tool: https://clang.llvm.org/extra/clang-tidy/
It contains 150+ "checks" that can automatically fix some common
problems by studying the syntax tree produced by the clang compiler.
Understanding the code at that level allows clang-tidy to attack
problems that simple scripts cannot touch.

I have not studied most of the clang-tidy checks, but did try a few
listed at the end of this email. You can see the whole list of checks at
https://clang.llvm.org/extra/clang-tidy/checks/list.html


Here are a few pros and cons of using clang-tidy compared to our own
custom scripts:

Pros:

* maintained and improved by others
* can fix problems that our scripts cannot see
* covers a few rules from C++ Core Guidelines and popular Style Guides
* arguably less likely to accidentally screw things up than our scripts

Cons:

* Requires installation of clang, clang-tidy-10, bear. It is not
difficult in a CI environment, but may be too much for occasional
contributors.

* Clang-tidy misses files that do not participate in a specific build
(e.g., misses many compat/ files that are not needed for an actual
build). Applying clang-tidy to all sources will be difficult.

* Clang-tidy misses code lines that do not participate in a specific
build (e.g., lines inside `#if HEADERS_LOG` where HEADERS_LOG was not
defined). Applying clang-tidy to all lines will be impractical.

* Clang-tidy would be difficult to customize or adjust (probably
requires building clang from scratch and writing low-level AST
manipulation code).

* Clang-tidy is relatively slow -- the whole repository scan takes
approximately 15-30 minutes per rule in my limited tests. Combining
rules speeds things up, but it may still be too slow to run during every
PR check on the current CI hardware.

* We do not have any clang-tidy experts on the development team (AFAIK).


I will itemize a few checks that I tried. The "diff" links below show
unpolished and partial changes introduced by the corresponding checks.
If we decide to use clang-tidy in principle, we will need to fine-tune
each check options (at least) to get the most out of the tool.

* modernize-use-override

Adds "override" (and removes "virtual") keywords from class declarations.

This check is very useful not just because "override" helps prevent
difficult-to-detect bugs but because it is very difficult to transition
to using "override" _gradually_ -- some compilers reject class
declarations that have a mixture of with-override and without-override
methods. Moreover, adding override keywords to old class declarations is
rather time-consuming because it is often not obvious (to a human)
whether the class introduces a new interface or overrides and old one.

Diff: https://github.com/measurement-factory/squid/commit/d00d0a8


* performance-...

Clang-tidy has a few checks focusing on performance optimizations. The
following commit shows a combination of the following four checks:
performance-trivially-destructible, performance-unnecessary-value-param,
performance-for-range-copy, performance-move-const-arg

Diff: https://github.com/measurement-factory/squid/commit/1ae5d7c


* modernize-use-nullptr

Replaces NULL and 0 constants with C++11 nullptr:
https://clang.llvm.org/extra/clang-tidy/checks/modernize-use-nullptr.html

While replacing most NULLs is possible with a simple script, this check
may be a better solution because it can safely cover more NULL uses and
also converts bare 0s which would be impossible using a script.

Diff: https://github.com/measurement-factory/squid/commit/4242604


In summary, I think investing in clang-tidy would be worth it because
the tool can address several important problems that we would otherwise
have to leave untreated. It would take some time to agree on a set of
checks and then properly configure/tune each one, but I think it is
doable. I am not sure whether these checks should be applied on each PR
check or periodically, but we can figure it out as we go.

I am not aware of any viable alternatives.

What do you think?


Thank you,

Alex.



More information about the squid-dev mailing list