1. Introduction

Complexity measurement tools provide several pieces of information.
They help to:

locate suspicious areas in unfamiliar code

get an idea of how much effort may be required to understand that code

get an idea of the effort required to test a code base

remind yourself that what you have written may seem obvious to you,
but not to others. It is useful to have a hint about which code
others may find harder to understand, and then to decide whether some
rework is in order.

But why another complexity analyzer? Even though the McCabe analysis
tool already exists (pmccabe), I think the job it does is too
rough for gauging complexity, though it is ideal for gauging the
testing effort. Each code path should be tested and the pmccabe
program provides a count of code paths. That, however, is not the
only issue affecting human comprehension. This program attempts to
take into account other factors that affect a human’s ability to
understand.

1.1 Code Length

Since pmccabe does not factor code length into its score, some folks
have taken to saying that either a long function or a high McCabe
score identifies a function requiring attention. But that means looking
at two separate factors without any visibility into how the length is
obfuscating the code.

The technique used by this program is to count 1 for each line that a
statement spans, plus the complexity score of control expressions
(for, while, and if expressions). The value for
a block of code is the sum of these multiplied by a nesting factor
(see section nesting-penalty option (-n)). This score is then added to the
score of the encompassing block. With all other things equal, a
procedure that is twice as long as another will have double the score.
pmccabe scores them identically.
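
The arithmetic described above can be sketched in C. This is only an
illustration of the idea, not the tool's implementation: the block
structure, the function names raw_sum and nested_score, and the nesting
factor of 2 are all assumptions made for the example, and the text does
not say whether the outermost block is itself multiplied (here it is not).

```c
#include <assert.h>
#include <stddef.h>

/* Assumed nesting factor for illustration; the tool's real default may differ. */
#define NESTING_FACTOR 2

/* Hypothetical model of a block of code; not the tool's own data structure. */
struct block {
    int lines;                     /* lines spanned by statements directly in the block */
    int control_cost;              /* score of the for/while/if expression guarding it */
    const struct block *children;  /* array of nested blocks, or NULL */
    size_t n_children;
};

int nested_score(const struct block *b);

/* Sum for one block: its own lines, its control expression, and the
   already-multiplied scores of the blocks nested inside it. */
int raw_sum(const struct block *b)
{
    assert(b != NULL);
    int sum = b->lines + b->control_cost;
    for (size_t i = 0; i < b->n_children; i++)
        sum += nested_score(&b->children[i]);
    return sum;
}

/* A nested block contributes its sum multiplied by the nesting factor. */
int nested_score(const struct block *b)
{
    return raw_sum(b) * NESTING_FACTOR;
}
```

Under these assumptions, a 10-line procedure body containing one 3-line
if block whose condition costs 1 scores 10 + 2 * (3 + 1) = 18. A body
twice as long, with 20 lines and two such if blocks, scores 36.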

1.2 Switch Statement

pmccabe has changed the scoring of switch
statements because the scores seemed too high. switch statements
are now “free” in this new analysis. That’s wrong, too.
The code length needs to be counted and the code within a switch
statement adds more to the difficulty of comprehension than code at
a shallower logic level.

This program will multiply the score of the switch statement
content by the nesting score factor (see section nesting-penalty
option (-n)).

1.3 Logic Conditions

‘pmccabe’ does not score logic conditions very well.
It overcharges for simple logical operations, it doesn’t charge for
comma operators, and it undercharges for mixing assignment operators
and relational operators and the ‘and’ and ‘or’ logical
operators.

For example:

xx = (A && B) || (C && D) || (E && F);

scores as 6. Strictly speaking, there are, indeed, six code
paths there. That is a fairly straightforward expression that is not
nearly as complicated as this:

and yet this scores exactly the same. This program reduces the cost
to very little for a sequence of conditions at the same level (that
is, all ‘and’ operators or all ‘or’ operators), so the raw scores
for these examples are 4 and 35, respectively (1 and 2 after scaling,
see section --scale).

If you nest boolean expressions, there is a little cost, assuming you
parenthesize grouped expressions so that ‘and’ and ‘or’
operators do not appear at the same parenthesized level, and
assuming that you do not mix assignment, relational, and boolean
operators all together. If you do not parenthesize these into
subexpressions, their small scores get multiplied in ways that
sometimes wind up as a much higher score.

The intent here is to encourage easy-to-understand boolean expressions.
This is done by:

not combining them with assignment statements

canonicalizing them (two level expressions with all &&
operators at the bottom level and all || operators in the
nested level, or vice versa)

parenthesizing for visual clarity (relational operations parenthesized
before being joined into larger && or || expressions)
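
As a rough illustration of these three guidelines, here is a small C
function written in the encouraged style. The function name has_input
and its parameters are hypothetical, invented for this sketch, and no
claim is made about the exact score the tool would assign to it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Discouraged form, shown only as a comment. It mixes an assignment
 * into the condition and leaves && and || at the same parenthesized
 * level:
 *
 *     if ((ok = buf != NULL && len > 0 || fb != NULL && fb[0] != '\0'))
 *
 * The encouraged form below keeps the condition free of assignments,
 * parenthesizes each relational test, and uses two levels: all &&
 * operators at the bottom level, joined by || above them.
 */
bool has_input(const char *buf, int len, const char *fallback)
{
    assert(len >= 0);   /* hypothetical precondition for this example */

    if (((buf != NULL) && (len > 0)) ||
        ((fallback != NULL) && (fallback[0] != '\0')))
        return true;

    return false;
}
```

Both forms compute the same result. The difference lies only in how
quickly a reader can see which tests belong together.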

1.4 Personal Experience

I have used pmccabe on a number of occasions. As a first-order
approximation, it does okay. However, I was interested in
zeroing in on the modules that needed the most help and there were a
lot of modules needing help. I was finding I was looking at some
functions where I ought to have been looking at others. So, I put
this together to see if there was a better correlation between what
seemed like hard code to me and the score derived by an analysis tool.

This has worked much better. I ran complexity and
pmccabe against several million lines of code. I correlated
the scores. Where the two tools disagreed noticeably in relative
ranking, I took a closer look. I found that ‘complexity’ did,
indeed, seem to be more appropriate in its scoring.

1.5 Rationale Summary

Ultimately, complexity is in the eye of the beholder and even
depends on the beholder’s mood at the time. It is difficult to
tune a tool to properly accommodate these variables.

complexity will readily score extremely simple functions as zero,
and code that is long with many levels of logic
nesting will wind up scoring much higher than with pmccabe, barring
extreme changes to the default values for the tunables.