I want to gather some statistics about a repository in order to track it over time. The goal is to know how the usage of specific languages evolved, and how the complexity and size of different projects grew or shrank.

There is a great tool called cloc which measures lines of code in different languages. It's a good start, but the LOC measure is not very representative. I would like to gather better measures such as, to begin with, logical lines of code (LLOC) and function points, and eventually cyclomatic complexity.

There are tools for that too:

Python has an excellent radon library which gives LLOC, cyclomatic complexity, etc., and makes it possible to indirectly determine the number of function points.

C# has, obviously, Visual Studio's Code Metrics which also gives detailed information, including LLOC, which, unlike LOC, is quite representative of the size of a project, as well as cyclomatic complexity.

JavaScript has complexity-report which also makes it possible to compute the number of function points, as well as the LLOC and cyclomatic complexity.

PHP seems to have a tool too, which gives both LLOC and the number of function points, as well as cyclomatic complexity, and other information.
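To illustrate what these tools measure and why LLOC is more representative than LOC, here is a minimal sketch using only Python's standard-library ast module. It is a crude stand-in, not radon's actual implementation: it counts one logical line per statement node, regardless of how the physical lines are wrapped.

```python
import ast

def lloc(source: str) -> int:
    """Count logical lines of Python code: one per statement node,
    however many physical lines each statement spans."""
    tree = ast.parse(source)
    return sum(isinstance(node, ast.stmt) for node in ast.walk(tree))

# Three physical lines, but only two logical lines:
# one assignment and one function-call statement.
snippet = "x = (1 +\n     2)\nprint(x)"
print(lloc(snippet))  # → 2
```

A plain line count would report 3 here; the LLOC of 2 better reflects the actual amount of code.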

What I can't find is a similar tool for Bash. There is the well-known static analysis tool ShellCheck, but that's not what I want: ShellCheck looks for possible issues in the code, similarly to JavaScript's jslint and C#'s Code Analysis.

So:

Is there a tool which, similarly to cloc, shows LLOC, function points and cyclomatic complexity for dozens of languages?

Or is there such a tool specifically for Bash scripts?

Note: I'm interested in a free tool which can be used from a Linux terminal, not paid products, and not online services or APIs.
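Lacking a ready-made tool, a first approximation of McCabe-style complexity for a shell script can be sketched in a few lines of Python. This is a hypothetical heuristic of my own, not a parser-accurate measure: it simply counts branching keywords and operators, and will miscount keywords that appear inside strings or heredocs.

```python
import re

# Keywords/operators that open a new branch in Bash. Treating each
# occurrence as one decision point is a heuristic, not a real parse.
DECISION = re.compile(r"(?:^|\s)(?:if|elif|while|until|for|case)\b|\|\||&&")

def bash_complexity(script: str) -> int:
    """McCabe-style complexity: 1 + number of decision points."""
    count = 0
    for line in script.splitlines():
        line = line.split("#", 1)[0]  # drop comments (and the shebang)
        count += len(DECISION.findall(line))
    return 1 + count

script = """\
#!/bin/bash
if [ -f "$1" ]; then
    grep -q TODO "$1" && echo "has TODOs"
fi
for f in *.sh; do
    echo "$f"
done
"""
print(bash_complexity(script))  # → 4 (if, &&, for, plus the base path)
```

A real implementation would need to tokenize the script properly (quoting, heredocs, subshells), which is exactly the kind of language-aware lexing the tools above do for their respective languages.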

@SteveBarnes: while it's slightly more relevant than cloc and, more importantly, adds cyclomatic complexity, it lacks LLOC and function points. On the other hand, it's not clear to me whether LLOC is really much more relevant than LOC for Bash, and function points may not be relevant either (for instance for large scripts containing no functions). I suggest that we wait a few days for other answers, and if there are none, close my question as a duplicate.
– Arseni Mourzenko Oct 22 '16 at 17:15

2 Answers

[2 months and no responses. I'm providing a commercial answer since
no other answers seem forthcoming.]

Our Source Code Search Engine (SCSE) is used to search large repositories containing many (arguably dozens of) languages for interesting code idioms. It is fast because it indexes the code base according to the lexical syntax of each of the languages; it uses a language-precise lexer for each language. (It is a Windows product, but it has explicitly been packaged to run under Wine, with shell scripts that make it look native on Linux.)

This would cover the languages in your code base except for Bash.
(Not off-the-shelf, but SCSE got the way it is by having a process for defining such lexers, and it would be possible to define a precise lexer for Bash). However, one of the available lexers is for something we call AdhocText, which is intended to be the programming language you find in a random computer programming book, so it contains all the classic lexemes you expect to find in a generic language. This works better than you might expect on a random programming language.

A messy problem with a big code base is categorizing the files by language, to associate each file with its corresponding lexer. We have another tool, File Inventory, which can be pointed at a set of directories, classifies files according to extension and content hints, and then re-validates the classification using the very same lexers used by the SCSE. Running this tool takes a completely disorganized set of directories, bins the files according to type, identifies duplicates, and generates the configuration files needed to run SCSE.

Summary:

SCSE is a tool which produces XML files containing LLOC and cyclomatic complexity for dozens of languages

It uses precise lexers to process the source files, according to each file's language

It can handle Bash (or other languages unknown to it) as AdhocText; alternatively, it would be possible to define a language-precise lexer for Bash

The File Inventory tool can classify a large set of files in preparation for use with SCSE