TRACE tool: Visualization and analysis of concurrent system activities

The TRACE tool helps to understand the complicated behavior over time of all kinds of systems. It offers domain-independent capabilities to visualize and analyze concurrent activities that are encoded in execution traces. Figure 1 shows a typical Gantt-chart TRACE view.

The activities in the system, named A-G, are represented by colored blocks with a start and end time.

Figure 1: Example visualization of system activities A-G (colored blocks) over time (x-axis).

Understanding system behavior over time

There are many reasons why a system’s behavior over time can become difficult to understand or, worse, confusing, even when the system is performing as designed. An example is a situation in which many concurrent activities share resources. Unforeseen interactions may arise due to the specific timing of the activities. Moreover, if the timing of the activities changes (e.g., due to an upgrade of the computational platform), then the interactions may also change, which may result in significantly different behavior. Insight into the how and why of a system’s behavior over time, however, is of paramount importance for making effective (design) choices and tradeoffs in all phases of the system life-cycle: from the design of a new system to the maintenance of an old legacy system. The TRACE tool can help with this.

Execution traces to capture behavior over time

The TRACE tool works with execution traces. An execution trace captures a single system behavior over time. In its barest form, it is a time-stamped sequence of start and end events of activities. TRACE extends this with (i) concepts from the Y-chart paradigm and (ii) a number of user-defined attributes (e.g., the name of the activity) to tailor it to a specific problem domain. This concept of execution trace is very generic, which makes TRACE widely applicable:

All levels of abstraction: the TRACE format can capture activities at any level of abstraction, from low-level embedded activities to system-level activities.

Domain-independent: the TRACE format is domain-independent, but can nevertheless be tailored to a specific domain via the user-defined attributes.

Source-independent: TRACE input can be created from any source, e.g., from log files of legacy systems or from a discrete-event simulation model.
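
The barest form of an execution trace described above, a time-stamped sequence of start and end events, can be sketched as follows. This is only an illustration in Python and not the TRACE input format: the names Event, Activity and pair_events are hypothetical, and the time unit is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    time: float  # timestamp (unit assumed, e.g. ms)
    kind: str    # "start" or "end"
    name: str    # activity name, an example of a user-defined attribute


@dataclass(frozen=True)
class Activity:
    name: str
    start: float
    end: float


def pair_events(events):
    """Match each end event with the oldest open start event of the same name."""
    open_starts = {}  # activity name -> start times not yet matched
    activities = []
    for ev in sorted(events, key=lambda e: e.time):
        if ev.kind == "start":
            open_starts.setdefault(ev.name, []).append(ev.time)
        elif ev.kind == "end":
            if not open_starts.get(ev.name):
                raise ValueError(f"end without start for {ev.name!r}")
            activities.append(Activity(ev.name, open_starts[ev.name].pop(0), ev.time))
        else:
            raise ValueError(f"unknown event kind {ev.kind!r}")
    return activities


events = [
    Event(0.0, "start", "A"),
    Event(2.0, "start", "B"),
    Event(5.0, "end", "A"),
    Event(7.0, "end", "B"),
]
activities = pair_events(events)
```

Each resulting Activity corresponds to one colored block in a Gantt-chart view: a name plus a start and an end time.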

Figure 2: The Y-chart method

The Y-chart paradigm decomposes a system into an application that is mapped to a platform; this separation fosters reuse. Furthermore, it defines a feedback cycle to allow systematic design-space exploration (Figure 2). The Y-chart concepts of application, mapping and platform are realized in TRACE by decomposing an activity (e.g., an image-processing computation) into one or more claims on resources for a certain amount of time (e.g., 2 cores of the CPU and 20 MB of RAM for 50 ms). These claims are the main elements of the execution traces that capture system behavior (star-1 in Figure 2). To provide feedback on the system under analysis (star-2 in Figure 2), TRACE provides extensive visualization and analysis of execution traces.
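
The decomposition of an activity into claims on resources can be illustrated with a small data structure. The sketch below uses the example from the text (2 CPU cores and 20 MB of RAM for 50 ms). The Claim class and the decompose function are hypothetical names chosen for illustration and do not reflect the actual TRACE data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    activity: str   # application side: which activity makes the claim
    resource: str   # platform side: which resource is claimed
    amount: float   # how much of the resource is claimed
    start: float    # start time in ms
    end: float      # end time in ms


def decompose(activity, start_ms, duration_ms, needs):
    """Turn one activity into one claim per resource it needs (the mapping)."""
    return [
        Claim(activity, resource, amount, start_ms, start_ms + duration_ms)
        for resource, amount in needs.items()
    ]


# The example from the text: 2 CPU cores and 20 MB of RAM for 50 ms.
claims = decompose("image-processing", 0.0, 50.0, {"CPU cores": 2, "RAM (MB)": 20})
```

Here the activity name stands for the application, the resource names stand for the platform, and the set of claims records the mapping between the two.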

Visualization and analysis of execution traces

Figure 3: A specialization of Figure 2 for TRACE

The TRACE tool helps to gain insight into the system dynamics of all kinds of systems through visualization and analysis of execution traces (Figure 3). TRACE visualizes concurrent activities in a Gantt-chart-like view that provides coloring, grouping and filtering options. This visualization alone is already very powerful and can quickly bring insight into the system dynamics. TRACE also provides several analysis methods, which set it apart from the many other Gantt-chart visualization tools.

Critical-path analysis can be used to detect tasks and resources that are bottlenecks for performance.

Distance analysis can be used to compare execution traces with respect to structure, e.g., to check a model trace against an implementation trace.

MTL checking provides a means to formally specify and verify properties of execution traces using Metric Temporal Logic. It is useful to express and check, for instance, performance properties such as "the processing latency is at most 50 ms".
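
To give a feel for the kind of property mentioned above, the following sketch checks "the processing latency is at most 50 ms" directly on a list of activities. It is a hand-written check under simplifying assumptions, not the TRACE MTL checker and not MTL syntax: latency is taken to be the end time minus the start time of each activity, and the function name and the trace data are made up for illustration.

```python
def latency_violations(activities, bound_ms):
    """Return the activities whose end comes more than bound_ms after their start.

    activities: iterable of (name, start_ms, end_ms) tuples.
    """
    return [(name, start, end) for (name, start, end) in activities
            if end - start > bound_ms]


# Hypothetical trace of three processed frames (times in ms).
trace = [
    ("frame-1", 0.0, 42.0),    # latency 42 ms
    ("frame-2", 40.0, 95.0),   # latency 55 ms, exceeds the bound
    ("frame-3", 80.0, 130.0),  # latency 50 ms, exactly at the bound
]
violations = latency_violations(trace, 50.0)
property_holds = not violations
```

Returning the violating activities, rather than only a yes/no answer, makes it easier to locate the offending part of the trace.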

The streaming performance DSL is a domain-specific language that captures often-used performance properties for stream-processing systems (e.g., image or video processing) and eases the use of the MTL checker.

The resource usage feature can quickly give insight into the details of the resource usage.
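
As an illustration of what insight into resource usage can mean, the sketch below computes how much of a single resource is in use over time from a set of claims. It is an assumption-laden example and not how TRACE implements this feature: claims are simplified to (start, end, amount) tuples and usage_profile is a hypothetical helper.

```python
def usage_profile(claims):
    """Compute a step profile of resource usage.

    claims: iterable of (start, end, amount) tuples for one resource.
    Returns a time-ordered list of (time, amount_in_use) points, where each
    amount holds from that time until the next point.
    """
    deltas = {}
    for start, end, amount in claims:
        deltas[start] = deltas.get(start, 0) + amount  # claim begins
        deltas[end] = deltas.get(end, 0) - amount      # claim is released
    profile = []
    in_use = 0
    for time in sorted(deltas):
        in_use += deltas[time]
        profile.append((time, in_use))
    return profile


# Hypothetical CPU-core claims: (start ms, end ms, number of cores).
cpu_claims = [(0, 50, 2), (20, 60, 1), (50, 80, 2)]
profile = usage_profile(cpu_claims)
peak = max(amount for _, amount in profile)
```

Comparing the peak of such a profile with the capacity of the resource is one simple way to spot over-subscription.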

The TRACE tool and the underlying concepts are relatively easy to learn, obtaining TRACE input is often relatively straightforward, and application of TRACE potentially has great benefits.