Scalable distributed monitoring system

Ganglia is a scalable distributed monitoring system for high-performance computing systems, such as clusters and grids. It is based on a hierarchical design targeted at federations of clusters. It relies on a multicast-based listen/announce protocol to monitor the state within clusters and uses a tree of point-to-point connections among representative cluster nodes to federate clusters and aggregate their state. It leverages widely used technologies such as XML for data representation, XDR for compact, portable data transport, and RRDtool for data storage and visualization.
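
The listen/announce idea described above can be illustrated with a small sketch. This is not Ganglia's actual wire format or API: the message layout (host, metric name, value), the function names, and the ClusterState class are all hypothetical, chosen only to show how an announcement might be packed in XDR style (big-endian, padded to 4-byte boundaries) and how a listener might keep per-node state that expires when a node stops announcing. Network transport is left out; in a real deployment the packet would be sent to a multicast group.

```python
import struct
import time


def xdr_string(text: str) -> bytes:
    """Encode a string XDR-style: 4-byte big-endian length, the bytes,
    then zero padding up to a multiple of 4 bytes."""
    data = text.encode("utf-8")
    padding = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * padding


def read_xdr_string(buf: bytes, offset: int):
    """Decode one XDR-style string; return (text, offset of next field)."""
    (length,) = struct.unpack_from(">I", buf, offset)
    start = offset + 4
    text = buf[start:start + length].decode("utf-8")
    padded = (length + 3) // 4 * 4
    return text, start + padded


def encode_announcement(host: str, metric: str, value: float) -> bytes:
    """Hypothetical announce message: host, metric name, 8-byte double."""
    return xdr_string(host) + xdr_string(metric) + struct.pack(">d", value)


def decode_announcement(packet: bytes):
    """Inverse of encode_announcement."""
    host, offset = read_xdr_string(packet, 0)
    metric, offset = read_xdr_string(packet, offset)
    (value,) = struct.unpack_from(">d", packet, offset)
    return host, metric, value


class ClusterState:
    """Listener side: remembers the last value heard for each
    (host, metric) pair and drops entries that have gone silent."""

    def __init__(self, timeout_seconds: float = 60.0):
        self.timeout_seconds = timeout_seconds
        self._metrics = {}  # (host, metric) -> (value, time last heard)

    def listen(self, packet: bytes, now: float = None) -> None:
        now = time.time() if now is None else now
        host, metric, value = decode_announcement(packet)
        self._metrics[(host, metric)] = (value, now)

    def snapshot(self, now: float = None) -> dict:
        """Return only the entries heard within the timeout window."""
        now = time.time() if now is None else now
        return {
            key: value
            for key, (value, heard) in self._metrics.items()
            if now - heard <= self.timeout_seconds
        }
```

A node would call encode_announcement() periodically and send the result to its peers; each peer would pass received packets to ClusterState.listen(). Because every node both announces and listens, any node can report the state of the whole cluster, which is what lets a single representative node speak for its cluster in the federation tree.
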

Authors:
--------
Matt Massie <massie@cs.berkeley.edu>
Preston Smith <psmith@physics.purdue.edu>
Steve Wagner
Federico Sacerdoti <fds@sdsc.edu>