dbg

MODULE

dbg

MODULE SUMMARY

The Text Based Trace Facility

DESCRIPTION

This module implements a text based interface to the
trace/3 and the trace_pattern/2 BIFs. It makes it
possible to trace functions, processes and messages on text based
terminals. It can be used instead of, or as complement to, the
pman module.

For some examples of how to use dbg from the Erlang
shell, see the simple example section.

The utilities are also suitable to use in system testing on
large systems, where other tools have too much impact on the
system performance. Some primitive support for sequential tracing
is also included, see the advanced topics section.

EXPORTS

Pseudo function that by means of a parse_transform
translates the literalfun() typed as parameter in
the function call to a match specification as described in
the match_spec manual of ERTS users guide.
(with literal I mean that the fun() needs to
textually be written as the parameter of the function, it
cannot be held in a variable which in turn is passed to the
function).

The parse transform is implemented in the module
ms_transform and the source must include the
file ms_transform.hrl in STDLIB for this
pseudo function to work. Failing to include the hrl file in
the source will result in a runtime error, not a compile
time ditto. The include file is easiest included by adding
the line
-include_lib("stdlib/include/ms_transform.hrl"). to
the source file.

The fun() is very restricted, it can take only a
single parameter (the parameter list to match), a sole variable or a
list. It needs to use the is_XXX guard tests and one
cannot use language constructs that have no representation
in a match_spec (like if, case,
receive etc). The return value from the fun will be
the return value of the resulting match_spec.

The imported variables will be replaced by match_spec
const expressions, which is consistent with the
static scoping for Erlang fun()s. Local or global
function calls can not be in the guard or body of the fun
however. Calls to builtin match_spec functions of course is
allowed:

As you can see by the example, the function can be called from
the shell too. The fun() needs to be literally in the
call when used from the shell as well. Other means than the
parse_transform are used in the shell case, but more or less
the same restrictions apply (the exception being records,
as they are not handled by the shell).

Warning

If the parse_transform is not applied to a module which calls this
pseudo function, the call will fail in runtime (with a
badarg). The module dbg actually exports a
function with this name, but it should never really be called
except for when using the function in the shell. If the
parse_transform is properly applied by including
the ms_transform.hrl header file, compiled code
will never call the function, but the function call is
replaced by a literal match_spec.

More information is provided by the ms_transform
manual page in STDLIB.

Traces Item in accordance to the value specified
by Flags. The variation of Item is listed below:

If the Item is a pid(), the corresponding
process is traced. The process may be a remote process
(on another Erlang node). The node must be in the list of
traced nodes (seen/1 and tracer/0/2/3).

If the Item is the atom all, all processes in the
system as well as all processes created hereafter are
to be traced. This also affects all nodes added with the
n/1 or tracer/0/2/3 function.

If the Item is the atom new, no currently existing
processes are affected, but every process created after the
call is.This also affects all nodes added with the
n/1 or tracer/0/2/3 function.

If the Item is the atom existing, all
existing processes are traced, but new processes will not
be affected.This also affects all nodes added with the
n/1 or tracer/0/2/3 function.

If the Item is an atom other than all,
new or existing, the process with the
corresponding registered name is traced.The process may be a
remote process (on another Erlang node). The node must be added
with the n/1 or tracer/0/2/3 function.

If the Item is an integer, the process <0.Item.0> is
traced.

If the Item is a tuple {X, Y, Z}, the
process <X.Y.Z> is
traced.

If the Item is a string "<X.Y.Z>"
as returned from pid_to_list/1, the process
<X.Y.Z> is traced.

Flags can be a single atom,
or a list of flags. The available flags are:

s (send)

Traces the messages the process sends.

r (receive)

Traces the messages the process receives.

m (messages)

Traces the messages the process receives and sends.

c (call)

Traces global function calls for the process
according to the trace patterns set in the system (see tp/2).

p (procs)

Traces process related events to the process.

sos (set on spawn)

Lets all processes created by the traced
process inherit the trace flags
of the traced process.

sol (set on link)

Lets another process, P2, inherit the
trace flags of the traced
process whenever the traced process links to P2.

sofs (set on first spawn)

This is the same as sos, but only
for the first process spawned by the traced process.

sofl (set on first link)

This is the same as sol, but only for
the first call to
link/1 by the traced process.

all

Sets all flags.

clear

Clears all flags.

The list can also include any of the flags allowed in
erlang:trace/3

The function returns either an error tuple or a tuple
{ok, List}. The List consists of
specifications of how many processes that matched (in the
case of a pure pid() exactly 1). The specification of
matched processes is {matched, Node, N}. If the
remote processor call,rpc, to a remote node fails,
the rpc error message is delivered as a fourth
argument and the number of matched processes are 0. Note
that the result {ok, List} may contain a list where
rpc calls to one, several or even all nodes failed.

This function enables call trace for one or more
functions. All exported functions matching the {Module, Function, Arity} argument will be concerned, but the
match_spec() may further narrow down the set of function
calls generating trace messages.

For a description of the match_spec() syntax,
please turn to the
User's guide part of the online
documentation for the runtime system (erts). The
chapter Match Specification in Erlang explains the
general match specification "language".

The Module, Function and/or Arity parts of the tuple may
be specified as the atom '_' which is a "wild-card"
matching all modules/functions/arities. Note, if the
Module is specified as '_', the Function and Arity
parts have to be specified as '_' too. The same holds for the
Functions relation to the Arity.

All nodes added with n/1 or tracer/0/2/3 will
be affected by this call, and if Module is not '_'
the module will be loaded on all nodes.

The function returns either an error tuple or a tuple
{ok, List}. The List consists of specifications of how
many functions that matched, in the same way as the processes
are presented in the return value of p/2.

There may be a tuple {saved, N} in the return value,
if the MatchSpec is other
than []. The integer N may then be used in
subsequent calls to this function and will stand as an
"alias" for the given expression. There are also built-in
aliases named with atoms (see also ltp/0 below).

If an error is returned, it can be due to errors in
compilation of the match specification. Such errors are
presented as a list of tuples {error, string()} where
the string is a textual explanation of the compilation
error. An example:

(x@y)4> dbg:tp({dbg,ltp,0},[{[],[],[{message, two, arguments}, {noexist}]}]).
{error,
[{error,"Special form 'message' called with wrong number of
arguments in {message,two,arguments}."},
{error,"Function noexist/1 does_not_exist."}]}

This function disables call tracing on the specified
functions. The semantics of the parameter is the same
as for the corresponding function specification in
tp/2 or tpl/2. Both local and global call trace
is disabled.

The return value reflects how many functions that matched,
and is constructed as described in tp/2. No tuple
{saved, N} is however ever returned (for obvious reasons).

Use this function to recall all match specifications previously
used in the session (i. e. previously saved during calls
to tp/2, and built-in match specifications.
This is very useful, as a complicated
match_spec can be quite awkward to write. Note that the
match specifications are lost if stop/0 is called.

Match specifications used can be saved in a file (if a
read-write file system is present) for use in later
debugging sessions, see wtp/1 and rtp/1

Use this function to "forget" all match specifications
saved during calls to tp/2.
This is useful when one wants to restore other match
specifications from a file with rtp/1. Use
dtp/1 to delete specific saved match specifications.

This function will save all match specifications saved
during the session (during calls to tp/2)
and built-in match specifications in a text
file with the name designated by Name. The format
of the file is textual, why it can be edited with an
ordinary text editor, and then restored with
rtp/1.

Each match spec in the file ends with a full stop
(.) and new (syntactically correct) match
specifications can be added to the file manually.

The function returns ok or an error tuple where the
second element contains the I/O error that made the
writing impossible.

This function reads match specifications from a file
(possibly) generated by the wtp/1 function. It checks
the syntax of all match specifications and verifies that
they are correct. The error handling principle is "all or
nothing", i. e. if some of the match specifications are
wrong, none of the specifications are added to the list of
saved match specifications for the running system.

The match specifications in the file are merged
with the current match specifications, so that no duplicates
are generated. Use ltp/0 to see what numbers were
assigned to the specifications from the file.

The function will return an error, either due to I/O
problems (like a non existing or non readable file) or due
to file format problems. The errors from a bad format file
are in a more or less textual format, which will give a hint
to what's causing the problem.

The dbg server keeps a list of nodes where tracing
should be performed. Whenever a tp/2 call or a
p/2 call is made, it is executed for all nodes in this
list including the local node (except for p/2 with a
specific pid() as first argument, in which case the
command is executed only on the node where the designated
process resides).

This function adds a remote node (Nodename) to the
list of nodes where tracing is performed. It starts a tracer
process on the remote node, which will send all trace messages
to the tracer process on the local node (via the Erlang
distribution). If no tracer process is running on the local
node, the error reason no_local_tracer is returned. The
tracer process on the local node must be started with the
tracer/0/2 function.

If Nodename is the local node, the error reason
cant_add_local_node is returned.

If a trace port (seetrace_port/2) is
running on the local node, remote nodes can not be traced with
a tracer process. The error reason
cant_trace_remote_pid_to_local_port is returned. A
trace port can however be started on the remote node with the
tracer/3 function.

The function will also return an error if the node
Nodename is not reachable.

This function starts a server on the local node that will
be the recipient of all trace messages. All subsequent calls
to p/2 will result in messages sent to the newly
started trace server.

A trace server started in this way will simply display the
trace messages in a formatted way in the Erlang shell
(i. e. use io:format). See tracer/2 for a description
of how the trace message handler can be customized.

This function starts a tracer server with additional
parameters on the local node. The first parameter, the
Type, indicates if trace messages should be handled
by a receiving process (process) or by a tracer port
(port). For a description about tracer ports see
trace_port/2.

If Type is a process, a message handler function can
be specified (HandlerSpec). The handler function, which
should be a fun taking two arguments, will be called
for each trace message, with the first argument containing the
message as it is and the second argument containing the return
value from the last invocation of the fun. The initial value
of the second parameter is specified in the InitialData
part of the HandlerSpec. The HandlerFun may
chose any appropriate action to take when invoked, and can
save a state for the next invocation by returning it.

If Type is a port, then the second parameter should
be a fun which takes no arguments and returns a
newly opened trace port when called. Such a fun is
preferably generated by calling trace_port/2.

If an error is returned, it can either be due to a tracer
server already running ({error,already_started}) or
due to the HandlerFun throwing an exception.

This function is equivalent to tracer/2, but acts on
the given node. A tracer is started on the node
(Nodename) and the node is added to the list of traced
nodes.

Note

This function is not equivalent to n/1. While
n/1 starts a process tracer which redirects all trace
information to a process tracer on the local node (i.e. the
trace control node), tracer/3 starts a tracer of any
type which is independent of the tracer on the trace control
node.

This function creates a trace port generating fun.
The fun takes no arguments and returns a newly opened
trace port. The return value from this function is suitable as
a second parameter to tracer/2, i. e. dbg:tracer(port, dbg:trace_port(ip, 4711)).

A trace port is an
Erlang port to a dynamically linked in driver that handles
trace messages directly, without the overhead of sending them
as messages in the Erlang virtual machine.

Two trace drivers are currently implemented, the
file and the ip trace drivers. The file driver
sends all trace messages into one or several binary files,
from where they later can be fetched and processed with the
trace_client/2 function. The ip driver opens a TCP/IP
port where it listens for connections. When a client
(preferably started by calling trace_client/2 on
another Erlang node) connects, all trace messages are sent
over the IP network for further processing by the remote
client.

Using a trace port significantly lowers the overhead
imposed by using tracing.

The file trace driver expects a filename or a wrap files
specification as parameter. A file is written with a high
degree of buffering, why all trace messages are not
guaranteed to be saved in the file in case of a system
crash. That is the price to pay for low tracing overhead.

A wrap files specification is used to limit the disk
space consumed by the trace. The trace is written to a
limited number of files each with a limited size.
The actual filenames are Filename ++ SeqCnt ++ Suffix, where SeqCnt counts as a decimal string
from 0 to WrapCnt and then around again from
0. When a trace term written to
the current file makes it longer than WrapSize,
that file is closed, if the number of files in this
wrap trace is as many as WrapCnt the oldest file
is deleted then a new file is opened to become the current.
Thus, when a wrap trace has been stopped, there are at most
WrapCnt trace files saved with a size of at least
WrapSize (but not much bigger), except for
the last file that might even be empty. The default values
are WrapSize = 128*1024 and WrapCnt = 8.

The SeqCnt values in the filenames are all in the
range 0 through WrapCnt with a gap in the
circular sequence. The gap is needed to find the end of the
trace.

If the WrapSize is specified as
{time, WrapTime}, the current file is closed when it
has been open more than WrapTime milliseconds,
regardless of it being empty or not.

The ip trace driver has a queue of QueSize messages
waiting to be delivered. If the driver cannot deliver messages
as fast as they are produced by the runtime system, a special
message is sent, which indicates how many messages that are
dropped. That message will arrive at the handler function
specified in trace_client/3 as the tuple {drop, N} where N is the number of consecutive messages
dropped. In case of heavy tracing, drop's are likely to occur,
and they surely occur if no client is reading the trace
messages.

This function is used to do a control operation on the
active trace port driver on the given node
(Nodename). Which operations that are allowed as well
as their return values are depending on which trace driver
that is used.

Returns either ok or {ok, Result}
if the operation was successful, or {error, Reason}
if the current tracer is a process
or if it is a port not supporting the operation.

The allowed values for Operation are:

flush

This function is used to flush the internal buffers
held by a trace port driver. Currently only the
file trace driver supports this operation.
Returns ok.

get_listen_port

Returns {ok, IpPort} where IpPortis
the IP port number used by the driver listen socket.
Only the ip trace driver supports this operation.

This function starts a trace client that reads the output
created by a trace port driver and handles it in mostly the
same way as a tracer process created by the tracer/0
function.

If Type is file, the client reads all trace
messages stored in the file named Filename or
specified by WrapFilesSpec (must be the same as used
when creating the trace, see trace_port/2)
and let's the default handler function format the
messages on the console. This is one way to interpret the data
stored in a file by the file trace port driver.

If Type is follow_file, the client behaves as
in the file case, but keeps trying to read (and
process) more data
from the file until stopped by stop_trace_client/1.
WrapFilesSpec is not allowed as second argument
for this Type.

If Type is ip, the client connects to the
TCP/IP port PortNumber on the host Hostname,
from where it reads trace messages until the TCP/IP connection
is closed. If no Hostname is specified, the local host
is assumed.

As an example, one can let trace messages be sent over the
network to another Erlang node (preferably not
distributed), where the formatting occurs:

On the node stack there's an Erlang node
ant@stack, in the shell, type the following:

All trace messages are now sent to the trace port driver,
which in turn listens for connections on the TCP/IP port
4711. If we want to see the messages on another node,
preferably on another host, we do like this:

-> dbg:trace_client(ip, {"stack", 4711}).
<0.42.0>

If we now send a message from the shell on the node
ant@stack, where all sends from the shell are traced:

ant@stack> self() ! hello.
hello

The following will appear at the console on the node that
started the trace client:

This function works exactly as trace_client/2, but
allows you to write your own handler function. The handler
function works mostly as the one described in
tracer/2, but will also have to be prepared to handle
trace messages of the form {drop, N}, where N is
the number of dropped messages. This pseudo trace message will
only occur if the ip trace driver is used.

For trace type file, the pseudo trace message
end_of_trace will appear at the end of the trace. The
return value from the handler function is in this case
ignored.

Advanced topics - combining with seq_trace

The dbg module is primarily targeted towards
tracing through the erlang:trace/3 function. It is
sometimes desired to trace messages in a more delicate way, which
can be done with the help of the seq_trace module.

seq_trace implements sequential tracing (known in the
AXE10 world, and sometimes called "forlopp tracing"). dbg
can interpret messages generated from seq_trace and the
same tracer function for both types of tracing can be used. The
seq_trace messages can even be sent to a trace port for
further analysis.

As a match specification can turn on sequential tracing, the
combination of dbg and seq_trace can be quite
powerful. This brief example shows a session where sequential
tracing is used:

This session sets the system_tracer to the same process as
the ordinary tracer process (i. e. <0.31.0>) and sets the
trace pattern for the function dbg:get_tracer to one that
has the action of setting a sequential token. When the function
is called by a traced process (all processes are traced in this
case), the process gets "contaminated" by the token and
seq_trace messages are sent both for the server request
and the response. The seq_trace:set_token([]) after the
call clears the seq_trace token, why no messages are sent
when the answer propagates via the shell to the console port.
The output would otherwise have been more noisy.

Note of caution

When tracing function calls on a group leader process (an IO process), there is risk
of causing a deadlock. This will happen if a group leader process generates a trace
message and the tracer process, by calling the trace handler function, sends an IO
request to the same group leader. The problem can only occur if the trace handler
prints to tty using an io function such as format/2. Note that when
dbg:p(all,call) is called, IO processes are also traced.
Here's an example:

The reason we get a deadlock in the first example is because when TAB is pressed
to expand the function name, the group leader (which handles character input) calls
mymod:module_info(). This generates a trace message which, in turn, causes the
tracer process to send an IO request to the group leader (by calling io:format/2).
We end up in a deadlock.

In the second example we use the default trace handler function. This handler
prints to tty by sending IO requests to the user process. When Erlang is
started in oldshell mode, the shell process will have user as its
group leader and so will the tracer process in this example. Since user calls
functions in lists we end up in a deadlock as soon as the first IO request is sent.

Here are a few suggestions for how to avoid deadlock:

Don't trace the group leader of the tracer process. If tracing has been switched on
for all processes, call dbg:p(TracerGLPid,clear) to stop tracing the group leader
(TracerGLPid). process_info(TracerPid,group_leader) tells you
which process this is (TracerPid is returned from dbg:get_tracer/0).

Don't trace the user process if using the default trace handler function.

In your own trace handler function, call erlang:display/1 instead of an
io function or, if user is not used as group leader, print to
user instead of the default group leader. Example:
io:format(user,Str,Args).