inviso

Main API Module to the Inviso Tracer

With the inviso API, runtime components can be started and tracing managed across a network of distributed Erlang nodes, using a control component that is also started with inviso API functions.

Inviso can be used both in a distributed environment and in a non-distributed one. API functions that do not take a list of nodes as argument work on all started runtime components. In the non-distributed case, that is the local runtime component. API functions that take a list of nodes as argument, or as part of one of the arguments, cannot be used in a non-distributed environment. Return values named NodeResult refer to the return value from a single Erlang node, and are therefore what is returned in the non-distributed environment.

Functions

start() -> {ok,pid()} | {error,Reason}

start(Options) -> {ok,pid()} | {error,Reason}

Options = [Option]

Options may contain both options which will be default options to a runtime component when started, and options to the control component. See add_nodes/3 for details on runtime component options. The control component recognizes the following options:

{subscribe,Pid}

Makes the process Pid receive Inviso events from the control component.

Starts a control component process on the local node. A control component must be started before runtime components can be started manually or otherwise accessed through the inviso API.

stop() -> shutdown

Stops the control component. Runtime components are left as is. They will behave according to their dependency values.

add_node(RTtag) -> NodeResult | {error,Reason}

add_node(RTtag,Options) -> NodeResult | {error,Reason}

RTtag = PreviousRTtag = term()

Options = [Option]

Option -- see below

Option = {dependency,Dep}

Dep = int() | infinity

The timeout, in milliseconds, before the runtime component will terminate if abandoned by this control component.

Option = {overload,Overload} | overload

Controls how and how often overload checks shall be performed. Just overload specifies that no load check shall be performed.

Overload = Interval | {LoadMF,Interval,InitMFA,RemoveMFA}

LoadMF = {Mod,Func} | function()/1

Interval = int() | infinity

Interval is the time in milliseconds between overload checks.

InitMFA = RemoveMFA = {Mod,Func,ArgList} | void

When starting up the runtime component or when changing options (see change_options/2), the overload mechanism is initialized with a call to the InitMFA function. It shall return LoadCheckData. Every time a load check is performed, LoadMF is called with LoadCheckData as its only argument. LoadMF shall return ok or {suspend,Reason}. When the runtime component is stopped or made to change options involving the overload check, the RemoveMFA function is called. Its return value is discarded.
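The three overload callbacks described above can be sketched as follows. This is a minimal illustration, not code from inviso: the module name overload_sketch, the choice of the scheduler run queue as load measure, and the use of a plain integer threshold as LoadCheckData are all assumptions.

```erlang
-module(overload_sketch).
-export([init/1, check/1, remove/0]).

%% InitMFA target. Its return value becomes LoadCheckData.
%% Assumption: LoadCheckData is just a run-queue threshold.
init(MaxRunQueue) when is_integer(MaxRunQueue) ->
    MaxRunQueue.

%% LoadMF target. Called with LoadCheckData as its only argument and
%% must return ok or {suspend,Reason}.
check(MaxRunQueue) ->
    case erlang:statistics(run_queue) of
        Len when Len > MaxRunQueue -> {suspend, {run_queue, Len}};
        _ -> ok
    end.

%% RemoveMFA target. Nothing to clean up here; the return value is discarded.
remove() ->
    ok.
```

With these hypothetical names, the option would take the form {overload,{{overload_sketch,check},5000,{overload_sketch,init,[50]},{overload_sketch,remove,[]}}}, following the Overload type given above.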

NodeResult = {ok,NAns} | {error,Reason}

NAns = new | {adopted,State,Status,PreviousRTtag} | already_added

State = new | tracing | idle

Status = running | {suspended,SReason}

Starts or tries to connect to an existing runtime component at the local node, regardless of whether the system is distributed or not. Options will override any default options specified at start-up of the control component.

The PreviousRTtag can indicate if the incarnation of the runtime component at the node in question was started by "us" and then can be expected to do tracing according to "our" instructions or not.
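How a caller might use PreviousRTtag can be sketched as a pure function over the NodeResult shapes listed above. The module name, function name and the returned atoms are hypothetical; only the NodeResult patterns come from this page.

```erlang
-module(rttag_sketch).
-export([classify/2]).

%% OurTag is the RTtag we pass to add_node; the second argument is the
%% NodeResult that came back.
classify(_OurTag, {ok, new}) ->
    started_by_us;
classify(_OurTag, {ok, already_added}) ->
    already_added;
%% OurTag appears twice in the head, so this clause only matches when the
%% adopted runtime component carries our own tag.
classify(OurTag, {ok, {adopted, State, Status, OurTag}}) ->
    {readopted, State, Status};
classify(_OurTag, {ok, {adopted, State, Status, _OtherTag}}) ->
    {foreign, State, Status};
classify(_OurTag, {error, Reason}) ->
    {error, Reason}.
```

A result of {foreign,State,Status} suggests the runtime component may be tracing according to someone else's instructions, so clearing it before use may be appropriate.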

add_nodes_if_ref(Nodes,RTtag) -> NodeResult | {error,Reason}

add_nodes_if_ref(Nodes,RTtag,Options) -> NodeResult | {error,Reason}

stop_nodes() -> {ok,NodeResults} | NodeResult

stop_nodes(Nodes) -> {ok,NodeResults} | {error,Reason}

NodeResults = [{Node,NodeResult}]

NodeResult = ok | {error,Reason}

Stops the runtime components on Nodes. If the control component is running on a distributed node, stop_nodes/0 stops all runtime components. If it is running on a non-distributed node, stop_nodes/0 stops the local and only runtime component.

Starts the tracing at the specified nodes, meaning that the runtime components transition from the state new or idle to tracing. For trace messages to be generated, trace patterns and/or trace flags must also be set. These cannot be set before tracing has been initiated with init_tracing/1,2.

TracerData controls how the runtime component will handle generated trace messages. The trace tag controls how regular trace messages are handled. The ti tag controls if and how trace information will be stored and whether the meta tracer will be activated. That is, if ti is omitted, no meta tracer will be started as part of the runtime component. It is possible to have ti without trace, but this is most likely not useful.

The ip and file trace tracerdata instructions result in the built-in trace ip-port and file-port being used, respectively. relayer causes all regular trace messages to be forwarded to a runtime component at the specified node. With a HandlerFun, every incoming regular trace message is applied to the HandlerFun. collector makes this runtime component receive relayed trace messages and print them to the shell.

The trace information can be configured to either write trace information to a plain trace information file or to relay it to another inviso meta tracer on another node. The inviso meta tracer is capable of matching function calls with their function returns (only if return_trace is activated in the meta trace match specification for the function in question). This is necessary since it may not be possible to decide what to do, if anything shall be done at all, until the return value of the function call is examined.

To be able to match calls with returns, a state can be saved, when a function call is detected, in a public loop data structure kept by the inviso meta tracer. The public loop data structure is given as argument to a handler function called whenever a meta trace message arrives at the inviso meta tracer (both function calls and function returns). The public loop data structure is first initiated by the Mi:Fi function, which takes the items in Argsi as arguments. Fi shall return the initial public loop data structure. When meta tracing is stopped, either because tracing is stopped or because tracing is suspended, Mr:Fr(PublicLoopData) is called to offer a possibility to clean up. Note that for every function on which meta tracing is activated, a public loop data modification function can be specified. That function will prepare the current loop data structure for this particular function.

Further, there is a risk that function call states become abandoned inside the public loop data structure. This will happen if a function call is entered into the public loop data structure, but no function return occurs. To prevent the public loop data structure from growing infinitely, the clean function Fc will periodically be called with the public loop data structure as argument. Elements entered into the public loop data structure as a result of a function call must contain a timestamp for Fc to be able to conclude whether they are abandoned or not. Fc shall return a new public loop data structure.
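A clean function of the kind described above might look like the sketch below. It assumes a user-defined public loop data structure that is simply a list of {Pid,TimestampMs,SavedState} tuples and a hypothetical age limit of 30 seconds; neither the layout nor the limit comes from inviso.

```erlang
-module(clean_sketch).
-export([clean/1, clean/2]).

%% Hypothetical age limit after which a saved call state counts as abandoned.
-define(MAX_AGE_MS, 30000).

%% Fc as the meta tracer would call it: the public loop data structure is
%% the only argument, and a new structure is returned.
clean(PublLD) ->
    clean(PublLD, erlang:monotonic_time(millisecond)).

%% Same logic with the current time passed in, which makes it easy to test.
%% Assumption: PublLD is a list of {Pid, TimestampMs, SavedState}.
clean(PublLD, NowMs) ->
    [E || {_Pid, Ts, _Saved} = E <- PublLD, NowMs - Ts =< ?MAX_AGE_MS].
```

Entries must be stamped with the same clock that clean/1 reads, here erlang:monotonic_time(millisecond), when they are inserted.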

When initiating tracing involving trace information without a TiSpec, a default public loop data structure will be initiated to handle locally registered process aliases. The default public loop data structure is a two-tuple where the first element is used by the meta tracing on the BIF register/2. The second element is left for user usage.

The default public loop data structure may be extended with more element positions. The first position must be left to the implementation of registered-name translations. If the public loop data structure is changed so that it no longer meets this requirement, tpm_localnames/0,1 and tpm_globalnames/0,1 can no longer be used.

A wrap files specification is used to limit the disk space consumed by the trace. The trace is written to a limited number of files, each with a limited size. The actual filenames are Filename ++ SeqCnt ++ Tail, where SeqCnt counts as a decimal string from 0 to WrapCnt and then around again from 0. When a trace message written to the current file makes it longer than WrapSize, that file is closed. If the number of files in this wrap trace has reached WrapCnt, the oldest file is deleted. A new file is then opened to become the current one. Thus, when a wrap trace has been stopped, there are at most WrapCnt trace files saved, each with a size of at least WrapSize (but not much bigger), except for the last file, which might even be empty. The default values are WrapSize == 128*1024 and WrapCnt == 8.

The SeqCnt values in the filenames are all in the range 0 through WrapCnt with a gap in the circular sequence. The gap is needed to find the end of the trace.
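The naming scheme can be illustrated with two small helper functions. The module and function names are hypothetical; the sketch only restates the rule Filename ++ SeqCnt ++ Tail and the wrap-around from WrapCnt back to 0.

```erlang
-module(wrap_sketch).
-export([filename/3, next_seq/2]).

%% Builds the actual file name: Filename ++ SeqCnt ++ Tail, with SeqCnt
%% written as a decimal string.
filename(Filename, SeqCnt, Tail) ->
    Filename ++ integer_to_list(SeqCnt) ++ Tail.

%% SeqCnt counts 0..WrapCnt and then starts over from 0.
next_seq(SeqCnt, WrapCnt) when SeqCnt >= WrapCnt -> 0;
next_seq(SeqCnt, _WrapCnt) -> SeqCnt + 1.
```

Since SeqCnt takes WrapCnt+1 distinct values while at most WrapCnt files exist, one value is always unused, which is the gap mentioned above.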

If the WrapSize is specified as {time,WrapTime}, the current file is closed when it has been open more than WrapTime milliseconds, regardless of it being empty or not.

The ip trace driver has a queue of QSize messages waiting to be delivered. If the driver cannot deliver messages as fast as they are produced by the runtime system, they are dropped. The number of dropped messages is indicated in the trace log as a separate trace message.

stop_tracing(Nodes) -> {ok,NodeResults} | {error,Reason}

stop_tracing() -> {ok,NodeResults} | NodeResult

Nodes = [Node]

NodeResults = [{Node,NodeResult}]

NodeResult = {ok,State} | {error,Reason}

State = new | idle

Stops tracing on all or specified Nodes. Flushes the trace buffer if a trace-port is used, closes the trace-port and removes all trace flags and meta-patterns. The nodes are called in parallel.

Stopping tracing means going to state idle. If the runtime component was already in state new, it will remain in state new (there was no tracing to stop).

clear() -> {ok,NodeResults} | NodeResult

clear(Nodes,Options) -> {ok,NodeResults} | {error,Reason}

clear(Options) -> {ok,NodeResults} | NodeResult | {error,Reason}

Nodes = [Node]

Options = [Option]

Option = keep_trace_patterns | keep_log_files

NodeResults = [{Node,NodeResult}]

NodeResult = {ok,{new,Status}} | {error,Reason}

Status = running | {suspended,SReason}

Stops all tracing including removing meta-trace patterns. Removes all trace patterns. If the node is tracing or idle, trace-logs belonging to the current tracerdata are removed. Hence the node is returned to state new. Note that the node can still be suspended.

Various options can make the node keep set trace patterns and log-files. The node still enters the new state.

Sets trace patterns (global) on specified or all nodes. The integer returned if the call was successful describes the number of matched functions. The functions without a Nodes argument operate on all nodes; in a non-distributed environment that means the local node. Wildcards follow the rules for wildcards of erlang:trace_pattern/3. It is for instance illegal to specify M == '_' while F is not '_'.

When calling several nodes, the nodes are called in parallel.

The option only_loaded prevents modules that are not (yet) loaded into the runtime system from being loaded merely because a trace pattern is requested to be set on them. Otherwise modules are automatically loaded if not already loaded (since the module must be present for a trace pattern to be set on it). The latter does not apply if the wildcard '_' is used as module specification.
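The wildcard rule inherited from erlang:trace_pattern/3 can be checked before a call is made. The helper below is a hypothetical pre-check, not part of inviso: a wildcard module requires a wildcard function and arity, and a wildcard function requires a wildcard arity.

```erlang
-module(pattern_sketch).
-export([valid_mfa/3]).

%% '_' as module is only legal together with '_' as function and arity.
valid_mfa('_', F, A) ->
    F =:= '_' andalso A =:= '_';
%% '_' as function is only legal together with '_' as arity.
valid_mfa(M, '_', A) when is_atom(M) ->
    A =:= '_';
%% A named function takes either a wildcard or a concrete arity.
valid_mfa(M, F, A) when is_atom(M), is_atom(F) ->
    A =:= '_' orelse (is_integer(A) andalso A >= 0);
valid_mfa(_, _, _) ->
    false.
```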

Sets process trace flags on processes on all or specified nodes. The integer returned if the call was successful describes the number of matched processes. The functions without a Nodes argument operate on all nodes; in a non-distributed environment that means the local node.

There are many combinations which do not make much sense, for instance specifying a certain process identifier at all nodes, or an empty TraceConfList for all nodes.

Initializes Mod:Func/Arity for meta tracing without setting any meta trace patterns. This is necessary if the named match specs will be used (see tpm_ms/5,6). Otherwise initialization of public loop data can be done at the same time as setting meta trace patterns using tpm/8,9.

Note that wildcards cannot be used here (even though they are perfectly legal in Erlang). The function also sets the CallFunc and ReturnFunc for the meta traced function, that is, the functions which will be called when a function call and a return_trace meta trace message, respectively, arrive at the inviso meta tracer for Mod:Func/Arity.

This function is also available without InitFunc and RemoveFunc. That means that no initialization of the public loop data structure will be done and that CallFunc and ReturnFunc must either use already existing parts of public loop data structure or not use it at all.

The InitFunc initializes the already existing public loop data structure for use with Mod:Func/Arity. InitFunc(Mod,Func,Arity,PublLD) -> {ok,NewPublLD,Output} where Output can be a binary which will then be written to the trace information file. If it is not a binary, no output will be done. RemoveFunc will be called when the meta tracing is cleared with ctpm/3,4. RemoveFunc(Mod,Func,Arity,PublLD) -> {ok,NewPublLD}.

Activates meta-tracing in the inviso_rt_meta tracer. Except when using tpm/6, tpm/8 and tpm/9 the Mod:Func/Arity must first have been initiated using init_tpm/N. When calling several nodes, the nodes are called in parallel.

CallFunc will be called every time a meta trace message arrives at the inviso meta tracer because of a call to Func. CallFunc(CallingPid,ActualArgList,PublLD) -> {ok,NewPrivLD,Output} where Output can be a binary or void. If it is a binary it will be written to the trace information file.

ReturnFunc will be called every time a meta return_trace message arrives at the inviso meta tracer because of a return_trace of a call to Func. ReturnFunc(CallingPid,ReturnValue,PublLD) -> {ok,NewPrivLD,Output}. Further, the ReturnFunc must handle the fact that a return_trace message may arrive for a call which was never noticed. This is because the message queue of the meta tracer may have been emptied.
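A pair of handlers that matches calls with returns could be sketched as follows. The sketch assumes a user-defined public loop data structure that is a plain list of {CallingPid,ArgList} entries, and it writes an illustrative text line as output; the real trace information file format is not described here, and the module name is hypothetical.

```erlang
-module(callret_sketch).
-export([call_func/3, return_func/3]).

%% CallFunc: remember the arguments of the call until the return arrives.
%% Assumption: PublLD is a list of {CallingPid, ArgList}.
call_func(CallingPid, ActualArgList, PublLD) ->
    {ok, [{CallingPid, ActualArgList} | PublLD], void}.

%% ReturnFunc: pair the return value with the saved call, if there is one.
return_func(CallingPid, ReturnValue, PublLD) ->
    case lists:keytake(CallingPid, 1, PublLD) of
        {value, {_Pid, Args}, Rest} ->
            Line = io_lib:format("~p ~p -> ~p~n",
                                 [CallingPid, Args, ReturnValue]),
            {ok, Rest, iolist_to_binary(Line)};
        false ->
            %% A return_trace for a call that was never noticed: leave the
            %% loop data untouched and produce no output.
            {ok, PublLD, void}
    end.
```

The sketch keeps one pending call per process. Nested or recursive calls to the traced function would need a stack per process instead.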

Same as tpm/X, but every match spec in MS containing a trace action term will have a {tracer,Tracer} appended to its enable-list. Tracer will be the current output for regular trace messages as specified when tracing was initiated. This function is useful when setting a meta trace pattern on a function with the intent that its execution shall turn tracing on for the process executing the match-spec in the meta trace pattern. The reason the tracer process trace flag cannot be explicitly written in the action term by the user is that it may be difficult to learn its exact value for a remote node. Furthermore, inviso functions are made to work on several nodes at the same time, which would require different match specs to be set for different nodes.

Simple example: We want any process executing the function mymod:init(1234) (with exactly the integer 1234 as argument) to begin function-call tracing. In the example, if the process is found to be one that shall start call tracing, we also first disable all process trace flags to ensure that we have full control over what the process traces. void in the example specifies that the meta-tracer (inviso_rt_meta) will not call any function when meta trace messages for mymod:init/1 arrive. There is no need for a CallFunc since the side-effect (start call-tracing) is achieved immediately with the match-spec.
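The example call itself is not present in this text. A match spec that fits the description can be sketched as below; the module name is hypothetical, and the argument order shown for tpm_tracer in the comment (Mod, Func, Arity, MS, CallFunc) is an assumption rather than something stated on this page.

```erlang
-module(tpm_tracer_sketch).
-export([match_spec/0]).

%% Head [1234]: matches a call with exactly one argument, the integer 1234.
%% Body {trace,[all],[call]}: first disables all trace flags of the calling
%% process, then enables the call flag.
match_spec() ->
    [{[1234], [], [{trace, [all], [call]}]}].

%% Assumed usage (argument order is an assumption, see the text above):
%%   inviso:tpm_tracer(mymod, init, 1, tpm_tracer_sketch:match_spec(), void).
```

According to the description of this function, inviso appends {tracer,Tracer} to the enable-list [call] on each node, so the match spec itself stays node independent.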

This function adds a list of match-specs to the already existing ones. It uses an internal database to keep track of existing match-specs. This set of match specs can hereafter be referred to with the name MSname. If the match-spec does not result in any meta traced functions (for whatever reason), the MS is not saved in the database. The previously known match-specs are not removed. If MSname is already in use as a name referring to a set of match-specs for this particular meta-traced function, the previous set of match-specs is replaced with MS.

Mod:Func/Arity must previously have been initiated in order for this function to add a match-spec.

When calling several nodes, the nodes are called in parallel. {ok,1} indicates success.

ctpm(Nodes,Mod,Func,Arity) -> {ok,NodeResults} | {error,Reason}

Removes the meta trace pattern for the function, which means that output is no longer generated for this function. The public loop data structure may be cleared by the previously entered RemoveFunc.

When calling several nodes, the nodes are called in parallel.

tpm_localnames() -> {ok,NodeResults} | NodeResult | {error,Reason}

tpm_localnames(Nodes) -> {ok,NodeResults} | {error,Reason}

NodeResults = [{Node,NodeResult}]

NodeResult = {R1,R2}

R1 = R2 = {ok,0} | {ok,1} | {error,Reason}

Quick version for setting meta-trace patterns on erlang:register/2. It uses a default CallFunc and ReturnFunc in the meta-tracer server. The main purpose of this function is to create ti-log entries for associations between pids and registered name aliases. The implementation uses return_trace to see if the registration was successful or not, before actually making the ti-log alias entry. Further the implementation also meta traces the BIF unregister/1.

If both N1 and N2 are 1, the function call was successful. N1 and N2 are the integers in R1 and R2 and represent setting the meta trace pattern on register/2 and unregister/1 respectively.

ctpm_localnames() -> {ok,NodeResults} | NodeResult | {error,Reason}

ctpm_localnames(Nodes) -> {ok,NodeResults} | {error,Reason}

NodeResults = [{Node,NodeResult}]

NodeResult = {R1,R2}

R1 = R2 = ok | {error,Reason}

Function for removing patterns previously set by tpm_localnames/0. The two results R1 and R2 represent the removal of the meta pattern from register/2 and unregister/1 respectively.

tpm_globalnames() -> {ok,NodeResults} | NodeResult | {error,Reason}

tpm_globalnames(Nodes) -> {ok,NodeResults} | {error,Reason}

NodeResults = [{Node,NodeResult}]

NodeResult = {R1,R2}

R1 = R2 = {ok,0} | {ok,1} | {error,Reason}

Quick version for setting meta-trace patterns capable of learning the association of a pid with a globally registered name (registered using global:register_name). The implementation meta-traces on global:handle_call({register,'_','_','_'},'_','_') and global:delete_global_name/2. N1 and N2 represent the success of the two sub-tpm calls.

ctpm_globalnames() -> {ok,NodeResults} | NodeResult | {error,Reason}

ctpm_globalnames(Nodes) -> {ok,NodeResults} | {error,Reason}

NodeResults = [{Node,NodeResult}]

NodeResult = {R1,R2} | {error,Reason}

R1 = R2 = ok | {error,Reason}

Function for removing meta patterns previously set by tpm_globalnames/0,1. The two results R1 and R2 represent the removal of the meta pattern from global:handle_call/3 and global:delete_global_name/2 respectively.

ctp_all() -> {ok,NodeResults} | NodeResult | {error,Reason}

ctp_all(Nodes) -> {ok,NodeResults} | {error,Reason}

NodeResults = [{Node,NodeResult}]

NodeResult = ok | {error,Reason}

Clears all, both global and local trace patterns. Does not clear meta trace patterns. Equivalent to a call to ctp/3,4 and to ctpl/3,4 with wildcards '_' for all modules, functions and arities.

suspend(SReason) -> {ok,NodeResults} | NodeResult | {error,Reason}

suspend(Nodes,SReason) -> {ok,NodeResults} | {error,Reason}

SReason = term()

NodeResults = [{Node,NodeResult}]

NodeResult = ok | {error,Reason}

Suspends the runtime components. SReason will become the suspend-reason replied in for instance a get_status/0,1 call. A runtime component that becomes suspended removes all trace flags and all meta trace patterns. In that way trace output is no longer generated. The task of reactivating a suspended runtime component is outside the scope of inviso. It can for instance be implemented by a higher layer trace-tool "remembering" all trace flags and meta patterns set.

cancel_suspension() -> {ok,NodeResults} | NodeResult | {error,Reason}

cancel_suspension(Nodes) -> {ok,NodeResults} | {error,Reason}

NodeResults = [{Node,NodeResult}]

NodeResult = ok | {error,Reason}

Makes the runtime components running again (as opposed to suspended). Since reactivating previous trace flags and meta trace patterns is outside the scope of inviso, cancelling suspension simply makes it possible to set trace flags and meta trace patterns again.

get_status() -> {ok,NodeResults} | NodeResult | {error,Reason}

get_status(Nodes) -> {ok,NodeResults} | {error,Reason}

NodeResults = [{Node,NodeResult}]

NodeResult = {ok,{State,Status}} | {error,Reason}

State = new | idle | tracing

Status = running | {suspended,SReason}

SReason = term()

Finds out the state and status of a runtime component. A runtime component is in state new before it has been initiated to do any tracing the first time. There are clear-functions which can make a runtime component become new again without having to restart. A runtime component becomes idle after tracing is stopped.
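The state changes described on this page can be summarized as a small transition function. The module, the event atoms and the error tuple are hypothetical; the transitions themselves follow the descriptions of init_tracing, stop_tracing and clear.

```erlang
-module(rt_state_sketch).
-export([next_state/2]).

%% Initiating tracing moves a new or idle runtime component to tracing.
next_state(init_tracing, new)  -> {ok, tracing};
next_state(init_tracing, idle) -> {ok, tracing};
%% Stopping tracing moves tracing to idle; new and idle are left as they are.
next_state(stop_tracing, tracing) -> {ok, idle};
next_state(stop_tracing, State) when State =:= new; State =:= idle ->
    {ok, State};
%% Clearing always returns the runtime component to new.
next_state(clear, State) when State =:= new; State =:= idle;
                              State =:= tracing ->
    {ok, new};
%% Anything else is reported with a hypothetical error term.
next_state(Event, State) ->
    {error, {bad_transition, Event, State}}.
```

Status (running or suspended) is independent of State and is not modelled here.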

get_tracerdata() -> {ok,NodeResults} | NodeResult | {error,Reason}

get_tracerdata(Nodes) -> {ok,NodeResults} | {error,Reason}

NodeResults = [{Node,NodeResult}]

NodeResult = {ok,NResult} | {error,Reason}

NResult = TracerData | no_tracerdata

Returns the current tracerdata of a runtime component. A runtime component in state new can not have tracerdata. An idle runtime component does have tracerdata, the last active tracerdata. TracerData will be a term as specified to init_tracing when tracing was initiated for the runtime component.

Returns the actually existing log files associated with TracerData. If a tracerdata is not specified, the current tracerdata is used for that particular runtime component. Files will be a list of one or more files should it be a wrap-set. Otherwise it is a list of only one filename.

This function is useful to learn the name and path of all files belonging to a trace. This information can later be used to move those files for merging. Note that since it is possible to ask about other tracerdata than the current one, it is possible to learn the filenames of previously done traces, provided that they have not been removed.

Copies log files over distributed Erlang to the control component node. This function can only be used in a distributed system.

The resulting transferred files will have the prefix Prefix and will be located in DestDir. The source files can either be pointed out using a FileListSpec or tracerdata. If no files are explicitly specified, current tracerdata for that node will be used. Note that if source files have the same name (on several nodes) they will overwrite each other at DestDir.

Deletes listed files or files corresponding to tracerdata. If no tracerdata or list of files are specified in the call, current tracerdata at the runtime components will be used to identify files to delete. All filenames shall be strings.

FileName can either be an absolute path or just a filename depending on if AbsPathFileName or a LogSpec was used to identify the file.

subscribe() -> ok | {error,Reason}

subscribe(Pid) -> ok | {error,Reason}

Pid = pid()

Adds Pid, or self() if subscribe/0 is used, to the inviso-event sending list. Note that it is possible to add a pid several times and that the Pid will then receive multiple copies of inviso-event messages.

Subscribing to inviso-event may be necessary for a higher layer trace-tool using inviso to follow the runtime components. local_runtime will be used for a runtime component running in a non-distributed environment.