Freshmeat spiel

Myer supports contemplative review
of C source code. It is for maintainers who know their program
"too well" and need to see it from a different angle. It colorizes
identifiers and constants to show their marginal cost to the program's
coupling and cohesion metrics. Myer is based on gcc and accepts
the same C dialect. It "runs the preprocessor in reverse",
propagating info from the parse tree back to spots in the .c and .h
files. Output is HTML.

Version 20031129 replaced some hacks with solid code (scoping of like-named
global variables, handling of type names).
Version 20031031 now supports gcc-3.3.2, uses much less swap, and fails
less often when merging header files.
Version 20031017 first release.

Please send your opinions of this project, so I can prioritize the to-do
list.

Example output

Cohesion is indicated by intensity: low cohesion
= bright, high cohesion = dim. Cohesion does not apply to
function-locals.

Compatible systems

Myer was developed on RedHat GNU/Linux
8.0. It should run on any Unix-like system that's supported
by gcc.

License = GPL. Some of Myer's code
is combined with gcc code in the same executable.

Installation

Download and unpack a copy of the gcc
"core" compiler. gcc-3.3.2 is recommended; gcc-3.2 also works.
To use other gcc revs, you'll have to adapt the patch files
yourself.

For a minimal installation of gcc for
Myer, execute in the gcc download directory:

./configure --enable-languages="c"

make all-libiberty

Download and unpack a copy of Myer (see link at beginning of this
document).

Alter the Makefile for Myer:

Change GCC_SRC to point to
your downloaded copy of gcc.

Change GCC_PATCH if you're using gcc-3.2 instead
of gcc-3.3.2. Patch files are supplied only for those two revs!

[Optional] change GCC_SYSINCLUDE
to point to your installed gcc library, but only if your installed
gcc is version 3.0 or later and you intend to delete your gcc download
directory after installing Myer.

[Optional] change OPT
if you want Myer to run even faster.

Run make in Myer's directory. This performs the following steps:

Copies some gcc source files to gcc_patch
and patches them. These patches add a new compiler option
-d@ which prints (line,col) and UID for each identifier
and constant.

Compiles and links the phase-1 compiler
./gcc_patch/cc1.

Creates the soft link ./myer_cc1
which points to ./gcc_patch/cc1.

Creates myerenv.c with a list of the standard predefined
cpp macros for your system.

Compiles and links Myer to produce program ./myer.

[Optional] To install Myer, copy ./myer
to the desired directory, then copy ./gcc_patch/cc1
to the same directory and rename it myer_cc1.

Invocation

For simple projects:

/path/myer *.[ch]

This creates a subdirectory "Myer" containing
colorized HTML versions of your files. It works best when
all the source files in the directory are part of one program. "/path/"
is where you have put your Myer executable.

For projects with Makefiles:

make
rm *.o
make CC=/path/myer

This creates subdirectories "xxx_Myer"
for each program xxx that the Makefile creates.
This technique requires that your Makefile be assiduous in
always using "$(CC)", not "cc" or "gcc" to invoke the compiler,
that each program contains at least one module that is separately compiled
to .o before being linked as an executable, and that you do not combine
object files into .a archives before linking. Myer contains
hacks for dealing with ".o" file extensions to make this technique
work. This invocation mode colorizes only the .c and not the
.h files; sorry.

The full monty:

/path/myer [options] files

Myer accepts a few options of its own, then passes
the rest to gcc. Files with ".c" extensions are passed to
gcc for parsing (using the new -d@ option in gcc). Files
with ".myerN" extensions are parsed internally (see -Pn
and -o below). For a filename ending with ".o", Myer
will instead parse the corresponding file with ".myer2" extension. Other
filenames are assumed to be headers and are processed only if some
parsed file referred to them via #include.

Myer options:

-Pn: Produce output for phase
n. There are six phases (see discussion below).
The most useful phases are -P6 (final HTML output)
and -P2 (the last phase where each .c file can be processed
separately). Default is -P6 unless -o
specifies a filename with a ".o" extension; in that case the default
is -P2. After you create a phase file with a .myerN
extension, you can send the file back into Myer on a later run to
continue processing from that point; if N = 3, 4, or 5 (which
describe entire programs), you can't send in any other files on that
run.

-oname: Send output to file name.
Default is "Myer/" for phase 6 (which needs an
output *directory*); or stdout for other phases. If name
ends with ".o", it is changed to ".myerN" where N
is the selected phase; if no phase is selected, -P2 is assumed.
If name has no final slash and phase 6 is selected,
"_Myer/" is appended to name to make the output directory.

-v: Print verbose progress reports during
execution.

-c: Ignored for compatibility with Makefiles.
Myer does not run ld.
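
The interaction between -P and -o described above can be sketched in C. This is only an illustration of the stated rules, not Myer's actual code: the function name resolve_output, the use of 0 for "no -P given", and the empty string for stdout are all assumptions made for the example. The document does not say what happens when a ".o" name is combined with an explicit -P6, so the sketch simply applies the ".myerN" rule in that case.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns nonzero if s ends with suffix. */
static int ends_with(const char *s, const char *suffix)
{
    size_t n = strlen(s), m = strlen(suffix);
    return n >= m && strcmp(s + n - m, suffix) == 0;
}

/*
 * Hypothetical helper: resolve the effective phase and output name.
 *   phase: 1..6, or 0 if no -P option was given
 *   name:  the -o argument, or NULL if no -o option was given
 * Writes the output name into out ("" stands for stdout in this sketch)
 * and returns the effective phase.
 */
static int resolve_output(int phase, const char *name, char *out, size_t outsize)
{
    assert(out != NULL && outsize > 0);

    if (name != NULL && ends_with(name, ".o")) {
        /* "foo.o" becomes "foo.myerN"; -P2 is assumed if no phase given. */
        if (phase == 0)
            phase = 2;
        snprintf(out, outsize, "%.*s.myer%d",
                 (int)(strlen(name) - 2), name, phase);
        return phase;
    }

    if (phase == 0)
        phase = 6;                      /* default phase */

    if (name == NULL)
        snprintf(out, outsize, "%s", phase == 6 ? "Myer/" : "");
    else if (phase == 6 && !ends_with(name, "/"))
        snprintf(out, outsize, "%s_Myer/", name);  /* make a directory name */
    else
        snprintf(out, outsize, "%s", name);
    return phase;
}
```

Under these assumptions, "-o foo.o" with no -P yields phase 2 and "foo.myer2", while "-P6 -o prog" yields the directory "prog_Myer/".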

Myer's phases

Phase 1 "parse"
Function performed: Parse C into a stream of
tokens that associate identifiers and constants with both their
text position (line,col) in the compilation unit and their semantic
units (= UIDs) in gcc's parse tree.
Limiting goal: Still recognizable as a
stream of C code.

Phase 2 "token"
Function performed: Fixup the token stream:
sort by (line,col), merge duplicate tokens in macro definitions,
deal with generated identifiers from macro calls. Split the
token stream by file of origin and convert compilation-unit line
numbers back to input-file line numbers.
Limiting goal: Still processing each compilation
unit separately.

Phase 3 "merge"
Function performed: Combine compilation units.
Merge the token-streams for header files mentioned in
several units.
Limiting goal: Still generic C processing.

Phase 4 "sum"
Function performed: Produce summary counts
of various things, that will be used by phase 5.
Limiting goal: Still no meat.

Phase 5 "calc"
Function performed: Calculate marginal costs
to coupling/cohesion for each identifier and constant.
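
The sort step of phase 2 can be illustrated with a small C sketch. The token layout here is hypothetical (this document does not describe Myer's real data structures), and treating two tokens at the same (line,col) as "duplicates" is an assumption made for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical token record; not Myer's actual layout. */
struct token {
    unsigned line;  /* line within the compilation unit */
    unsigned col;   /* column within that line */
    unsigned uid;   /* UID of the semantic unit in gcc's parse tree */
};

/* Order tokens by text position: line first, then column. */
static int token_cmp(const void *a, const void *b)
{
    const struct token *x = a, *y = b;
    if (x->line != y->line)
        return x->line < y->line ? -1 : 1;
    if (x->col != y->col)
        return x->col < y->col ? -1 : 1;
    return 0;
}

/*
 * Sort by (line,col), then keep only the first token at each position.
 * Returns the number of tokens kept.
 */
static size_t sort_and_merge(struct token *t, size_t n)
{
    size_t i, kept = 0;

    if (n == 0)
        return 0;
    assert(t != NULL);
    qsort(t, n, sizeof *t, token_cmp);
    for (i = 0; i < n; i++) {
        if (kept > 0 && token_cmp(&t[kept - 1], &t[i]) == 0)
            continue;               /* same position as the previous token */
        t[kept++] = t[i];
    }
    return kept;
}
```

Note that qsort is not a stable sort, so which of two same-position tokens survives is unspecified in this sketch.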

Customization of parameters

This is an alpha rev! For now, you'll have
to edit the program's header files. The parameters all have
names starting with "DEFAULT_", in case the program someday
acquires command-line options to override these defaults.

Parameters in file myer.h:

DEFAULT_SCALE_COUPLING: Specifies the
numerators in the three formulas for the marginal coupling cost
of an item. Denominators: how many modules refer to its defining
module, how many functions in this module refer to items in its module,
how many items from its module are referenced from this function. Total
coupling is the sum of these three fractions. The default numerators
(35%, 50%, 15%) need more research, as does the sum-of-fractions formula.

DEFAULT_SCALE_COHESION: Specifies the numerators
in the three formulas for the marginal cohesion cost of an identifier.
Denominators: total references from all modules, total
references from this module, total references from this function.
Total cohesion is the sum of these three fractions. A
*high* value for the total indicates *low* cohesion cost. The
default numerators (35%, 50%, 15%) need more research, as does the
sum-of-fractions formula.
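
The sum-of-fractions formula used by both parameters can be sketched as follows. This is a minimal illustration, not the code in myer.h: the function name is made up, and pairing the three numerators with the three denominators in the order they are listed above is an assumption. What Myer does when a count is zero is not described here, so the sketch requires every count to be at least 1.

```c
#include <assert.h>

/* Default numerators from the text: 35%, 50%, 15%. */
static const double default_scale[3] = { 0.35, 0.50, 0.15 };

/*
 * Hypothetical helper: total = num[0]/count[0] + num[1]/count[1]
 *                            + num[2]/count[2].
 * For coupling, the counts are the three "how many ..." denominators;
 * for cohesion, they are the three reference totals.
 */
static double sum_of_fractions(const double num[3], const unsigned count[3])
{
    double total = 0.0;
    int i;

    for (i = 0; i < 3; i++) {
        assert(count[i] >= 1);      /* zero counts are outside this sketch */
        total += num[i] / count[i];
    }
    return total;
}
```

With the default numerators, counts of (1, 1, 1) give a total of 1.0 and counts of (2, 2, 2) give 0.5, which is why the likely totals fall on 1, 1/2, 1/3 and so on.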

DEFAULT_OUTPUT_DIRECTORY: Name of output
directory when no -o option is used. This is also
the suffix used (with a preceding underscore) when -o
specifies a name without a final slash. Default value is
"Myer/".

Parameters in file myerhtml.h:

DEFAULT_GAMMA_COUPLING: Specifies the exponent
in the equation
html_coupling = orig_coupling^(1/gamma)
A gamma > 1 reduces the visual differences
between identifiers that have high coupling cost, while magnifying
the differences among low-coupling items. This correction
is needed partly because of the logarithmic response of the eye,
but mostly because of the sum-of-fractions nature of the marginal-cost
formulas. Likely values for marginal cost of coupling include
1, 1/2, 1/3, 1/4,
etc., where the denominator is the number of module items referenced.
It seems wasteful to assign half the color space to the difference
between the first two likely values! The default coupling gamma
(1.75, needs more research) changes the likely-value steps to 1.0, 0.67,
0.53, 0.45, etc.

DEFAULT_COLOR_GLOBAL: How to compute
a foreground color from coupling and cohesion values. The
default is a simplistic formula where color varies from blue to cyan
as coupling varies from 0.0 to 1.0, while brightness varies from 40%
to 100% as cohesion varies from 0.0 to 1.0. Notes:
[1] The eye perceives a cyan
color as much brighter than a blue at the same rate of photon emission.
[2] At equal RGB brightness,
a cyan color (= blue + green phosphors) involves twice as many
photons as a blue (= just blue).
Maybe the formula should use
the CIELuv colorspace instead of RGB to make the color
and brightness formulas independent?
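
A minimal sketch of the default formula as described: hue runs from blue to cyan with coupling, and brightness runs from 40% to 100% with cohesion. The struct, the function name, and the rounding to 8-bit channels are assumptions for illustration; the real macro in myerhtml.h may be written quite differently.

```c
#include <assert.h>

struct rgb { unsigned char r, g, b; };   /* 8-bit channels, 0..255 */

/*
 * Hypothetical helper: blue (coupling 0.0) to cyan (coupling 1.0),
 * scaled by a brightness of 40% (cohesion 0.0) to 100% (cohesion 1.0).
 */
static struct rgb global_color(double coupling, double cohesion)
{
    struct rgb c;
    double brightness;

    assert(coupling >= 0.0 && coupling <= 1.0);
    assert(cohesion >= 0.0 && cohesion <= 1.0);

    brightness = 0.40 + 0.60 * cohesion;
    c.r = 0;                                                /* no red */
    c.g = (unsigned char)(coupling * brightness * 255.0 + 0.5);
    c.b = (unsigned char)(brightness * 255.0 + 0.5);        /* always blue */
    return c;
}
```

Because green is simply added on top of blue, a cyan result emits more light than a blue one at the same nominal brightness, which is the coupling between color and brightness that notes [1] and [2] complain about.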

DEFAULT_COLOR_MODLOCAL: How to compute
a foreground color for a module-local item, which has only cohesion
and no coupling. The default is a simplistic formula that
uses varying brightness of magenta (equal amounts of blue and
red).

DEFAULT_HTML_STARTCONST and DEFAULT_HTML_ENDCONST:
Some HTML code to emit before and after literal constants to
distinguish them from identifiers. The default gives them
a peach-colored background.

C dialect limitations

Myer should accept any C program that gcc will
accept, but doesn't. Here are the gcc features known to
be unsupported by Myer:

Identifiers longer than 4095 characters.

Programs that contain more than 65535 .c
and .h files.

Files that #include more than 1000
other files.

Nested function definitions (a gcc extension
to standard C).

#include within a function body.

Using a macro to select the name of another macro
that takes arguments:

#include <stdlib.h>
#define croak(arg) exit(arg)
#define die croak
int main(void) { die(1); }

To-do list

Better marginal-cost formulas. I find
the program's current output not as pleasing as I had hoped.

Do a better job of merging .h files from
different compilation units. When Myer analyzes the gcc sources, it fails
to merge about 5% of the header files.

Originally, Myer was supposed to be much more ambitious.
It would use lateral inhibition to sharpen the border between
bug-prone areas and okay areas. The marginal-cost formulas
would be recursive, so calls to functions that manipulate "more
of the same" objects would be dimmer than calls that do something
totally different. Most of the output would be gray except
for a few bright spots for code that was truly worthy of a second look.
That didn't happen, because I couldn't think of suitable formulas
to make it happen. But Torvalds says, "Publish early and often"...

Instead of linear RGB values, perhaps group
the results into "very low coupling", "low coupling", "medium
coupling", etc. and use fewer, more distinctive colors. Or
maybe this is just a problem for my laptop LCD?

Better format for .myer5 files. It's
hard to get one's bearings with the current format.

Accept .myer5 files as input. I left this
out because I don't like the current format.

Write an autoconf script for portability.

Add a way to include header files in the
output when invoked via "make CC=myer".

Think of some way to handle #line
directives, which are currently ignored.