ROOT: An Object-Oriented Data Analysis Framework

ROOT is a system for large-scale data
analysis and data mining. It is being developed for the analysis of
Particle Physics data, but can be equally well used in other fields
where large amounts of data need to be processed.

After many years of experience in developing interactive data
analysis systems like PAW and PIAF (see Resources), we realized
that the growth and maintainability of these products, written in
FORTRAN and using 20-year-old libraries, had reached their limits.
Although still popular in the physics community, these systems do
not scale up to the challenges posed by the next-generation
particle accelerator, the Large Hadron Collider (LHC), currently
under construction at CERN, in Geneva, Switzerland. The expected
amount of data produced by the LHC will be on the order of several
petabytes (1PB = 1,000,000GB) per year. This is two to three orders
of magnitude more than what is being produced by the current
generation of accelerators.

Therefore, in early 1995, Rene Brun and I started developing
a system, intending to overcome the deficiencies of these previous
programs. One of the first decisions we made was to follow the
object-oriented analysis and design methodology and to use C++ as
our implementation language. Although all of our previous
programming experience was in FORTRAN, we soon realized the power
of OO and C++, and after some initial “throw-away” prototyping,
the ROOT system began to take shape.

In November 1995, we gave the first public presentation of
ROOT at CERN and, at the same time, version 0.5 was released via
the Web. By then, Nenad Buncic and Valery Fine had joined our
team.

Since the initial release, there has been a constantly
increasing number of users. In response to comments and feedback,
we've been regularly releasing new versions containing bug fixes
and new features. Version 1.0 was released in January 1997 and
version 2.0 in March 1998. Since the release of version 1.0, more than
9,300 copies of the ROOT binaries have been downloaded from our web
site, about 500 people have registered as ROOT users, and the web
site gets up to 100,000 hits per month.

ROOT is currently being used in many different fields, including
physics, astronomy, biology, genetics, finance, insurance and
pharmaceuticals.

The source and binaries for many different platforms can be
downloaded from the ROOT web site (http://root.cern.ch/). The
current version can be used and distributed freely as long as
proper credit is given and copyright notices are maintained. For
commercial use, the authors would like to be notified.

Remote database access, either via a special daemon or via the Apache web server

Ported to all known UNIX and Linux systems and also to Windows 95 and NT

The complete system consists of about 450,000 lines of C++
and 80,000 lines of C code. There are about 310 classes grouped in
24 different frameworks, each framework represented by its own
shared library.

The CINT C/C++ Interpreter

One of the key components of the ROOT system is the CINT
C/C++ interpreter. CINT, written by Masaharu Goto of Hewlett
Packard Japan, covers 95% of ANSI C and about 85% of C++. Template
support is being worked on, and exceptions are still missing. CINT
is complete enough to be able to interpret its own 70,000 lines of
C and to let the interpreted interpreter interpret a small
program.

The advantage of a C/C++ interpreter is that it allows for
fast prototyping, since it eliminates the typical time-consuming
edit/compile/link cycle. Once a script or program is finished, you
can compile it to machine code with a standard C/C++ compiler (such
as gcc) and enjoy full machine performance. Since CINT is very
efficient (for
example, for/while loops are byte-code compiled on the fly), it is
quite possible to run small programs in the interpreter. In most
cases, CINT outperforms other interpreters like Perl and
Python.
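
As a minimal sketch of that workflow, the file below contains only
standard C++, so the same source could be tried out in an interpreter
and later compiled unchanged with a compiler such as g++. The function
names are hypothetical, and whether a given CINT version accepts every
construct used here is an assumption.

```cpp
#include <cassert>
#include <cstdio>

// Sum of squares 1^2 + 2^2 + ... + n^2, computed with a plain loop.
// A simple for loop like this is the kind of code the article says
// an interpreter can byte-code compile on the fly.
long sum_squares(int n)
{
    assert(n >= 0);  // negative input is treated as a programming error
    long total = 0;
    for (int i = 1; i <= n; ++i) {
        total += (long)i * i;
    }
    return total;
}

// Hypothetical script entry point: prints one result.
void report_sum_squares(int n)
{
    std::printf("sum of squares up to %d = %ld\n", n, sum_squares(n));
}
```

To build a stand-alone program, one would add a main() that calls
report_sum_squares() and compile the file in the usual way.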

Existing C and C++ libraries can easily be interfaced to the
interpreter. This is done by generating a dictionary from the
function and class definitions. The dictionary provides CINT with
all necessary information to be able to call functions, create
objects and call member functions. A dictionary is easily generated
by the program rootcint, which takes the library header files as
input and produces a C++ file containing the dictionary as output.
You compile the dictionary and
link it with the library code into a single shared library. At
run-time, you dynamically link the shared library, and then you can
call the library code via the interpreter. This can be a very
convenient way to quickly test some specific library functions.
Instead of having to write a small test program, you just call the
functions directly from the interpreter prompt.
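
The idea behind a dictionary can be illustrated with a toy example: a
table that maps function names to callable entry points, so that code
holding only a name (as an interpreter does) can still reach compiled
library code. This is a hand-written sketch under that assumption, not
the code rootcint generates; ToyDictionary, lib_square and lib_half are
hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <string>

// Two ordinary compiled "library" functions (hypothetical).
double lib_square(double x) { return x * x; }
double lib_half(double x) { return x / 2.0; }

// Pointer type shared by every function stored in the toy dictionary.
typedef double (*UnaryFn)(double);

// Toy dictionary: looks up a compiled function by its name.
class ToyDictionary {
public:
    void add(const std::string& name, UnaryFn fn) { table_[name] = fn; }

    bool has(const std::string& name) const
    {
        return table_.find(name) != table_.end();
    }

    // Call a registered function by name; the name must be registered.
    double call(const std::string& name, double arg) const
    {
        std::map<std::string, UnaryFn>::const_iterator it = table_.find(name);
        assert(it != table_.end());
        return it->second(arg);
    }

private:
    std::map<std::string, UnaryFn> table_;
};

// Registration step, written by hand here; a real dictionary generator
// would derive the entries from the header files.
ToyDictionary make_dictionary()
{
    ToyDictionary d;
    d.add("lib_square", lib_square);
    d.add("lib_half", lib_half);
    return d;
}
```

With such a table in place, a prompt that reads the text "lib_square"
and an argument can dispatch to the compiled function without a
separate test program.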

The CINT interpreter is fully embedded into the ROOT system.
As a result, the ROOT command line, scripting and programming
languages are identical. The embedded interpreter dictionaries
provide the information needed to automatically create GUI
elements, such as context pop-up menus unique to each class, and to
generate fully hyperlinked HTML class documentation.
Furthermore, the dictionary information provides complete run-time
type information (RTTI) and run-time object introspection
capabilities.
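
To make the link between dictionary information and GUI elements more
concrete, here is a toy sketch in which per-class metadata (a class
name plus a list of method names) is turned into the entries of a
context menu. ToyClassInfo, describe_histogram and context_menu are
hypothetical names chosen for illustration; they are not part of ROOT
or CINT.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Minimal run-time description of a class: its name and its methods.
struct ToyClassInfo {
    std::string name;
    std::vector<std::string> methods;
};

// Metadata for a hypothetical histogram class, filled in by hand.
ToyClassInfo describe_histogram()
{
    ToyClassInfo info;
    info.name = "ToyHistogram";
    info.methods.push_back("Draw");
    info.methods.push_back("Fit");
    info.methods.push_back("Reset");
    return info;
}

// Build one menu entry per method, qualified with the class name.
std::vector<std::string> context_menu(const ToyClassInfo& info)
{
    assert(!info.name.empty());
    std::vector<std::string> items;
    for (std::size_t i = 0; i < info.methods.size(); ++i) {
        items.push_back(info.name + "::" + info.methods[i]);
    }
    return items;
}
```

Because the menu is derived from the metadata rather than written per
class, adding a method to the description is enough to make it appear
in the menu.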

Comments

I am often a little nervous when approaching a software package for the first time. This is exactly the sort of article to help me overcome my fears. From what I hear, ROOT is probably one of the most powerful databases out there (I am impressed, for instance, by how highly the data can be compressed), but you also seem to have devoted a huge amount of effort to making it as user-friendly as possible.

I have been reviewing my use of databases over the last few days because of some concern about MySQL, these concerns being prompted by the legal action being taken by Oracle against Google, and Oracle's reduction in support for OpenSolaris. I had never considered ROOT as an alternative to MySQL, until recently thinking of it as a rather specialised tool. I was pleased to note however that there are bindings for Ruby amongst other languages.

I wonder about your figures for OS shares. I have just installed ROOT, but instead of downloading it from its official site I installed it from the Debian repository. If many people do this, then the numbers you record may underestimate how widely ROOT is used on Linux.