JavalibSawja

Nit: Nullability Inference Tool

Presentation

Nit is infers nullness properties of variables and can prove that a
program is NullPointerException free. It allows to find suitable nullability
annotations for fields, method parameters, return values and local variables.
It works at the bytecode level (on .class files or .jar files) so it can be
used on programs where the source is not available. While this can look strange
for a programmer, this tool can also be used by other static analyses to
improve their precision (through annotation).
The main improvement over the last version of Nit (website of old version) is that instead of using
directly the Bytecode representation, it benefits of the Sawja JBir
representation. It leads to more simple and easy to maintain code. Moreover it
drastically reduces the time and memory consumption of an analysis.

Licence

This software is distributed under the GNU General Public License.

News

October 2013

Nit 0.6 is out. This is a rewriting of Nit
with the facilities of Sawja

September 2009

Nit 0.5d is out. This is a bugfix release.
Check the CHANGELOG file for all the details.

June 2009

Nit and the brand new Eclipse plugin are at
JavaOne,
at the INRIA booth.

the use
of JavaLib 1.8
which improves the performances and the precision (call graph
computed with RTA instead of CHA);

the addition of an output format compatible with the
Annotation File Utilities
(http://types.cs.washington.edu/annotation-file-utilities/),
which should allow to export computed annotations back into
the Java source code;

the compatibility with Cygwin, which allows to compile and
use Nit on Microsoft Windows.

Check the CHANGELOG file for all the details.

October 2008

Nit 0.4 is out. The major improvement is
in the use of JavaLib
1.7 which improves performances by more than 7% and reduce memory
consumption. Some improvements have also been made on
precision. Check the CHANGELOG file for all the details.

August 2008

Nit 0.3 is out. The precision has been
greatly improved: the new must-alias analysis is more powerful and a
new additional analysis allows to recover information from the
instranceof instruction (do not worry if you do not understand: it
just means it is more precise). The other good news: it uses
even less memory in the general case.

June 2008

A brand new web site with a lot more than
before in the documentation section.

June 2008

Nit 0.2 is out. This version contains much
more documentation (which is easy as the previous ones were
almost undocumented) and a bug found by Fausto Spoto in the
statistics computation has been fixed.

Compilation from source

You can the compile with make. The output is the file nit.opt. If
something did not work as expected, make sure that Sawja is correctly installed and recognised by ocamlfind.

User documentation

Command line options

Usage: nit [options ... ] --main <string>

nit infers nullness properties for the reference variables of a Java
program and can therefore be used to infer null-ability annotations for fields,
local variables, methods parameters and return values.

It takes options but no argument and analyzes the program starting from the
given main method (with --main option). At least one
output option must be given.

Options:

--classpath <string>

It takes a list of directories and .jar files concatenated with ':'
(or ';' under Windows) where needed classes (including boot strap
classes) will be looked for. It should therefore contain the path of
your runtime (or bootclasspath in Java). If no --classpath option is given, then the environment
variable CLASSPATH is used.

--main <string>

Set the main method of the program. It takes as argument a class
name (such as foo.Bar) optionally followed by a method signature in
the JVM (internal) syntax (such as '.m(IIL)V'). If no method
signature is given, the default is '.main([Ljava/lang/String;)V',
which corresponds to the void main(String[]) method. This option must
be specified at least once and can be specified more than once.

--heap-input <file>

Try to use as an heap dump file and init the Nit initial
state with the information from this file.

--html-output <directory>

Outputs
the result as HTML code in the given directory (including annotated
opcodes).

--xml-output <f.xml>

Output the result as XML in the given file (excluding opcodes, i.e. only method and field annotations).

--anno-file-output <f.jaif>

Output the result (only method and field annotations and only Nullable/NonNull, ie no raw annotations) in the given file in a format suitable for use with the Annotation File Utilities (http://types.cs.washington.edu/annotation-file-util ities/).

--warnings

Output warning for each unsafe dereference on the standard output (stdout).

--version

Print version information and exit.

--no-others-entryPoints

Dont add any implicit entrypoints (internal to the JVM).

--debug

Print every debugging information.

--verbose

Print some debugging information.

-v

Print some debugging information.

--no-compact

Avoid compacting the result of intermediate analyses (It can sometimes save time but always costs memory).

--remove-package <string>

It takes a comma separated list of package names (e.g. "java,sun" or "java.lang") and removes all classes in these packages from the output. This option can be specified multiple times. Note: these packages are still analyzed, the option just removes the information concerning these packages from the output.

--class-analysis

{RTA|CRA} Either RTA or CRA (RTA is the default option). It selects the analysis used to load the program (RTA is much more precise than CRA but is also very sensitive to native methods and may therefore produce incomplete programs) (the type system then adds co-variant and contra-variant constraints on types so it does not change the analysis).

--no-field-annotation

Deactivate field annotations. It reduces the precision of the analysis without significantly improving time and memory consumption.

--no-method-annotation

Deactivate method annotations. It reduces the precision of the analysis without significantly improving time and memory consumption.

--unsafe-exception

Annotate caught exception object as if they were always initialized (NonNull) objects. It improves the results but it is incorrect (as it is possible to throw the current object this in a constructor).

--unsafe-native

Annotate native methods as if they were always returning initialized objects (NullableInit). It improves the results but it is incorrect.

--unsafe-static

Annotate static fields as if no class initializer could initialize a static field to not-fully initialized value, i.e to a Raw value. It improves the results but it is incorrect.

--unsafe-array

Annotate arrays as if none of their cells were ever assigned a Raw value. It improves the results but it is incorrect.

--aggressive-native

Annotate native methods as if their return value were always NonNull. It improves the results but it is incorrect.

--aggressive-array

Annotate arrays as if all their cell were cells always NonNull. It improves the results but it is incorrect.

--aggressive-static

Annotate static fields as if they were always initialized by their class initializer with a NonNull value. It improves the results but it is incorrect.

--aggressive-field

Annotate instance fields as if they were always NonNull. It improves the results but it is incorrect.

--aggressive-parameter

Annotate method parameters as if they were always NonNull. It improves the results but it is incorrect.

--aggressive-return

Annotate return value of methods as if they were always NonNull. It improves the results but it is incorrect.

-help

Display this list of options

--help

Display this list of options

Output formats

XML output format

In the following grammar, the star (*) means that the previous token
is present any number of times (including zero). The question
mark (?) means that the previous token is optional. The plus sign
(+) means that the previous token is present any number of times and
at least once. The parentheses are not tokens, they are here to
group tokens together so the following star, question mark or
plus applies to the list of tokens.

< and > in method names are respectively replaced by ^ and $ in the
XML output. Method and field descriptors are composed of the name of the
method or field, respectively, and their "type" (or descriptor) as documented
in the JVM Specification
(Field
Descriptors,
Method
Descriptors).

Annotation File format (produced with option --anno-file-output))

Jaime Quinonez has developed an annotation file format and a tool suite
(which can be found at
http://types.cs.washington.edu/annotation-file-utilities/) that allows to
extract annotations from Java source file or Java bytecode (.class) files to
this file format and to insert annotations for a such file into Java source or
Java bytecode files. This tool suite is supposed to work on the annotations
defined in the JSR 308, which should be part of Java 7. It now seems to be
maintained by Michael Ernst.

HTML output formats

The text output is in the JBir intermediate language and should be human
readable (it is easier than the bytecode).

Field declarations are annotated with annotation starting with @ written
just after the field name. Method annotations are written after the field
signature. They include the list of parameters, including the self reference
in the case of a virtual method. One-word non-reference type are annotated with
NonNull, two-word non-reference type are annotated
with two NonNull annotations.

Instruction are also annotated with the flow-sensitive information
computed by the analysis:

this shows the set of object fields definitely set at that point
(Bot or Top meaning all or none of them, respectively)

locals is the annotations for local variables

In addition to those annotations computed by the analysis (the real
purpose of the program) and to the information javap
-verbose -private usually outputs, there are also some other
information outputted to help to navigate in the program and to understand
the computed annotations. Their name should be sufficiently meaningful to be
understandable without further documentation. Here is the kind of information
added for each kind of data.

Data

Information added

Class

The direct sub-classes

Interface

The direct implementations (sub-classes) and direct
sub-interfaces.

Non-abstract method

The interfaces that define this method and the classes in
which this method is overridden.

Abstract method

The class or interface that define a method that this
method definition overrides, the class or interface that
overrides this method definition and the class that implements
this method.

Statistics output format

Statistics can be made on the result of the analysis via
option --stats. It counts
dereferences in field reads
(getfield), field writes
(putfield) and virtual method calls
(invokevirtual,
invokeinterface and
invokespecial) of each type
(@Nullable,
@NonNull,
@Raw). It also counts annotations of
each type for fields, method formal parameters and return values
(including static methods) if the Java type is of type reference
(object or array) and if the code is not proved dead (field
actually read, method actually called).

The output format is the CSV format where the first line is the
caption and the second line contains the data. The column
separator is the ';' character. The title of each column is
explained here.

Field name

Description

Name

Gives the name of the main class of the program that has
been analyzed.

Free

Unused.

RAW Fields

Number of fields annotated as Raw.

MBN fields

Number of fields annotated as Nullable.

NN fields

Number of fields annotated as NonNull.

total invokes

Total number of method calls taken into account (reachable
invokevirtual,
invokespecial,
invokeinterface and
invokestatic).

Raw invokes

Number of virtual method calls on reference annotated
as Raw.

NN invokes

Number of virtual method calls on reference annotated
as NonNull.

MBN invokes

Number of virtual method calls on reference annotated
as Nullable.

invokes Raw

Number of method calls that return values annotated
as Raw.

invokes NN

Number of method calls that return values annotated
as NonNull.

invokes MBN

Number of method calls that return values annotated
as Nullable.

total arguments

Total number of actual arguments taken into account
(i.e. value of type reference that have been passed as
arguments of method calls taken in account). Note: Total
arguments = Raw arguments + MBN arguments)

Raw arguments

Number of arguments that are annotated as
Raw.

MBN arguments

Number of arguments that are annotated as
Nullable.

total getfields

Total number of field readings taken into account. Note:
Total getfields = getfields on MBN + getfields on Raw +
getfields on NN.

getfields on MBN

Number of field reading on values annotated
as Nullable.

getfields on Raw

Number of field readings on values annotated
as Raw.

getfields of MBN

Number of field readings that have return values annotated
as Nullable.

getfields of RAW

Number of field readings that have return values annotated
as Raw.

getfields of NN

Number of field readings that have return values annotated
as NonNull.

total putfields

Total number of field writings that have been taken into
account. Note: Total putfields = putfields on MBN +
putfields on Raw + putfields on NN.

putfields on MBN

Number of field writings on values annotated as
Nullable.

putfields on RAW

Number of field writings on values annotated as
Raw.

putfields of MBN

Number of field writings where written values are annotated
as Nullable.

putfields of RAW

Number of field writings where written values are annotated
as Raw.

putfields of NN

Number of field writings where written values are annotated
as NonNull.

total parameters

Total number of formal parameters taken into account
(i.e. formal parameters of type reference of methods (virtual
or static) that have been taken into account).Note: Total
parameters = MBN parameters + Raw parameters + NN
parameters.

MBN parameters

Number of formal parameters annotated as
Nullable.

Raw parameters

Number of formal parameters annotated as
Raw.

total return

Total number of return annotations taken into account
(i.e. one for each method taken in account if the return
type of the method is of type reference). Note: Total
return = MBN return + NN return.

MBN return

Number of returns annotated as
Nullable.

Raw return

Number of returns annotated as
Raw.

MBN ArrayLoad

Number of array loads on array references annotated as
Nullable.

NN ArrayLoad

Number of array loads on array references annotated as
NonNull
or Raw.

MBN ArrayStore

Number of array stores on array references annotated as
Nullable.

NN ArrayStore

Number of array stores on array references annotated as
NonNull
or Raw.

MBN ArrayLength

Number of arraylength
instructions on array references annotated as
Nullable.

NN ArrayLoad

Number of arraylength
instructions on array references annotated as
NonNull
or Raw.

Mem usage (MB)

Maximum quantity of memory used by the process during its
whole life in megabytes.

Time (s)

Quantity of CPU time used by or for the process (user time
and system time) in seconds.

Version

A string with the compile-time options that were used when
building the program and the command line arguments given
for that execution. (The version is to identify the
statistics, not the software version.)