erl_syntax

Abstract Erlang syntax trees.

This module defines an abstract data type for representing Erlang
source code as syntax trees, in a way that is backwards compatible
with the data structures created by the Erlang standard library
parser module erl_parse (often referred to as "parse
trees", which is a bit of a misnomer). This means that all
erl_parse trees are valid abstract syntax trees, but the
reverse is not true: abstract syntax trees cannot in general be used
as input to functions expecting an erl_parse tree.
However, as long as an abstract syntax tree represents a correct
Erlang program, the function revert/1 should be able to
transform it to the corresponding erl_parse
representation.

A recommended starting point for the first-time user is the
documentation of the syntaxTree() data type, and
the function type/1.
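
As a first orientation, the following minimal sketch applies type/1 to an erl_parse tuple, to an erl_syntax node, and to a node that mixes both representations. The module name type_demo is only an illustrative assumption.

```erlang
-module(type_demo).
-export([run/0]).

%% Returns the node types of an erl_parse tree, an erl_syntax node,
%% and a tuple node that contains both of them as elements.
run() ->
    ParseTree = {atom, 1, foo},                  % erl_parse representation
    Var = erl_syntax:variable("X"),              % erl_syntax representation
    Tuple = erl_syntax:tuple([ParseTree, Var]),  % mixes both representations
    [erl_syntax:type(T) || T <- [ParseTree, Var, Tuple]].
```

The same type/1 call works on all three values, which is the point of treating erl_parse trees as a subset of syntaxTree().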

NOTES:

This module deals with the composition and decomposition of
syntactic entities (as opposed to semantic ones); its
purpose is to hide all direct references to the data structures used
to represent these entities. With few exceptions, the functions in
this module perform no semantic interpretation of their inputs, and
in general, the user is assumed to pass type-correct arguments; if
this is not done, the effects are not defined.

With the exception of the erl_parse data structures,
the internal representations of abstract syntax trees are subject to
change without notice, and should not be documented outside this
module. Furthermore, we do not give any guarantees on how an abstract
syntax tree may or may not be represented, with the following
exceptions: no syntax tree is represented by a single atom, such
as none, by a list constructor [X | Y], or
by the empty list []. This can be relied on when writing
functions that operate on syntax trees.
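
The guarantee above is what makes it safe to write one function that accepts none, a list of trees, or a single tree, and to tell them apart by pattern matching. A minimal sketch follows; the module node_count and the function count/1 are hypothetical names.

```erlang
-module(node_count).
-export([count/1]).

%% Counts the nodes in a syntax tree. The first three clauses rely on the
%% fact that no syntax tree is the atom none, the empty list, or a cons cell,
%% so the last clause is only reached for an actual tree.
count(none) -> 0;
count([]) -> 0;
count([T | Ts]) -> count(T) + count(Ts);
count(Tree) -> 1 + count(erl_syntax:subtrees(Tree)).
```

Because subtrees/1 returns a list of groups (lists of trees), the list clauses handle both levels of nesting.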

DATA TYPES

erl_parse() = parse_tree() (see module erl_parse)

The "parse tree"
representation built by the Erlang standard library parser
erl_parse. This is a subset of the
syntaxTree type.

syntaxTree()

An abstract syntax tree. The
erl_parse "parse tree" representation is a subset of the
syntaxTree() representation.

Every abstract syntax tree node has a type, given by the
function type/1. Each node also
has associated attributes; see get_attrs/1 for details. The
functions make_tree/2 and subtrees/1 are generic
constructor/decomposition functions for abstract syntax trees. The
functions abstract/1 and concrete/1 convert between
constant Erlang terms and their syntactic representations. The set of
syntax tree nodes is extensible through the tree/2 function.

A syntax tree can be transformed to the erl_parse
representation with the revert/1
function.

syntaxTreeAttributes()

This is an abstract representation of
syntax tree node attributes; see the function get_attrs/1.

FUNCTIONS

abstract(Term::term()) -> syntaxTree()

Returns the syntax tree corresponding to an Erlang term.
Term must be a literal term, i.e., one that can be
represented as a source code literal. Thus, it may not contain a
process identifier, port, reference, binary or function value as a
subterm. The function recognises printable strings, in order to get a
compact and readable representation. Evaluation fails with reason
badarg if Term is not a literal term.
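
A small sketch of how abstract/1 pairs with concrete/1, and of the string recognition mentioned above. The module name abstract_demo is an assumption made for illustration.

```erlang
-module(abstract_demo).
-export([round_trip/1, node_types/0]).

%% Converts a literal term to a syntax tree and back again.
round_trip(Term) ->
    erl_syntax:concrete(erl_syntax:abstract(Term)).

%% A printable string becomes a string node; a list of non-printable
%% integers stays a list node.
node_types() ->
    [erl_syntax:type(erl_syntax:abstract(T)) || T <- ["hi", [1, 2, 3], {a, 1}]].
```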

application(Module, Function, Arguments) -> syntaxTree()

Module = none | syntaxTree()

Creates an abstract function application expression. If
Module is none, this call is equivalent
to application(Function, Arguments), otherwise it is
equivalent to application(module_qualifier(Module, Function),
Arguments).

atom_value(Node::syntaxTree()) -> atom()

attribute(Name) -> syntaxTree()

attribute(Name::syntaxTree(), Args::Arguments) -> syntaxTree()

Arguments = none | [syntaxTree()]

Creates an abstract program attribute. If
Arguments is [A1, ..., An], the result
represents "-Name(A1, ...,
An).". Otherwise, if Arguments is
none, the result represents
"-Name.". The latter form makes it possible
to represent preprocessor directives such as
"-endif.". Attributes are source code forms.

Note: The preprocessor macro definition directive
"-define(Name, Body)." has relatively
few requirements on the syntactical form of Body (viewed
as a sequence of tokens). The text node type can be used
for a Body that is not a normal Erlang construct.
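
The two forms of attribute described above can be sketched as follows. The module name attribute_demo is hypothetical; the erl_syntax calls are the ones documented in this module.

```erlang
-module(attribute_demo).
-export([module_attr/1, endif/0]).

%% Intended to represent "-module(Name)."
module_attr(Name) when is_atom(Name) ->
    erl_syntax:attribute(erl_syntax:atom(module), [erl_syntax:atom(Name)]).

%% Intended to represent "-endif." (no argument list at all).
endif() ->
    erl_syntax:attribute(erl_syntax:atom(endif)).
```

Either result can be passed to a pretty-printer such as erl_prettypr:format/1 to obtain source text, or to revert/1 where an erl_parse form exists.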

catch_expr_body(Node::syntaxTree()) -> syntaxTree()

char(Value::char()) -> syntaxTree()

Creates an abstract character literal. The result represents
"$Name", where Name corresponds to
Value.

Note: the literal corresponding to a particular character value is
not uniquely defined. E.g., the character "a" can be
written both as "$a" and "$\141", and a Tab
character can be written as "$\11", "$\011"
or "$\t".

comment(Strings) -> syntaxTree()

comment(Pad::Padding, Strings::[string()]) -> syntaxTree()

Creates an abstract comment with the given padding and text. If
Strings is a (possibly empty) list
["Txt1", ..., "TxtN"], the result
represents the source code text

%Txt1
...
%TxtN

Padding states the number of empty character positions
to the left of the comment separating it horizontally from
source code on the same line (if any). If Padding is
none, a default positive number is used. If
Padding is an integer less than 1, there should be no
separating space. Comments are in themselves regarded as source
program forms.
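
A brief sketch of the padding argument, assuming a hypothetical module comment_demo. It builds one comment with default padding and one with explicit padding, then reads the values back.

```erlang
-module(comment_demo).
-export([run/0]).

run() ->
    Default = erl_syntax:comment(["first line", "second line"]),
    Padded = erl_syntax:comment(2, [" padded"]),
    {erl_syntax:comment_text(Default),      % the strings, without the leading %
     erl_syntax:comment_padding(Default),   % none when no padding was given
     erl_syntax:comment_padding(Padded)}.
```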

comment_text(Node::syntaxTree()) -> [string()]

compact_list(Node::syntaxTree()) -> syntaxTree()

Yields the most compact form for an abstract list skeleton. The
result either represents "[E1, ..., En |
Tail]", where Tail is not a list
skeleton, or otherwise simply "[E1, ...,
En]". Annotations on subtrees of Node
that represent list skeletons may be lost, but comments will be
propagated to the result. Returns Node itself if
Node does not represent a list skeleton.
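
To illustrate, this sketch builds the skeleton "[A | [B]]" and compacts it. The module name compact_demo is an assumption.

```erlang
-module(compact_demo).
-export([run/0]).

run() ->
    A = erl_syntax:variable("A"),
    B = erl_syntax:variable("B"),
    Nested = erl_syntax:list([A], erl_syntax:list([B])),  % [A | [B]]
    Compact = erl_syntax:compact_list(Nested),            % [A, B]
    {length(erl_syntax:list_prefix(Nested)),
     length(erl_syntax:list_prefix(Compact)),
     erl_syntax:list_suffix(Compact)}.
```

Before compaction the prefix holds one element and the rest sits in the suffix; afterwards both elements are in the prefix and there is no suffix.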

concrete(Node::syntaxTree()) -> term()

Returns the Erlang term represented by a syntax tree. Evaluation
fails with reason badarg if Node does not
represent a literal term.

Note: Currently, the set of syntax trees which have a concrete
representation is larger than the set of trees which can be built
using the function abstract/1. An abstract character
will be concretised as an integer, while abstract/1 does
not at present yield an abstract character for any input. (Use the
char/1 function to explicitly create an abstract
character.)
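
The asymmetry described in the note can be observed with a short sketch (module name char_demo is hypothetical):

```erlang
-module(char_demo).
-export([run/0]).

run() ->
    FromAbstract = erl_syntax:abstract($a),  % an integer node
    FromChar = erl_syntax:char($a),          % an explicit character node
    {erl_syntax:type(FromAbstract),
     erl_syntax:type(FromChar),
     erl_syntax:concrete(FromAbstract),
     erl_syntax:concrete(FromChar)}.
```

Both nodes concretise to the same integer, but only char/1 yields a node of type char.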

conjunction_body(Node::syntaxTree()) -> [syntaxTree()]

cons(Head::syntaxTree(), Tail::syntaxTree()) -> syntaxTree()

"Optimising" list skeleton cons operation. Creates an abstract
list skeleton whose first element is Head and whose tail
corresponds to Tail. This is similar to
list([Head], Tail), except that Tail may
not be none, and that the result does not necessarily
represent exactly "[Head | Tail]", but
may depend on the Tail subtree. E.g., if
Tail represents [X, Y], the result may
represent "[Head, X, Y]", rather than
"[Head | [X, Y]]". Annotations on
Tail itself may be lost if Tail represents
a list skeleton, but comments on Tail are propagated to
the result.
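
The difference between cons/2 and list/2 can be sketched as follows, using a hypothetical module cons_demo:

```erlang
-module(cons_demo).
-export([run/0]).

run() ->
    [H, X, Y] = [erl_syntax:variable(N) || N <- ["H", "X", "Y"]],
    Consed = erl_syntax:cons(H, erl_syntax:list([X, Y])),        % [H, X, Y]
    Explicit = erl_syntax:list([H], erl_syntax:list([X, Y])),    % [H | [X, Y]]
    {length(erl_syntax:list_prefix(Consed)),
     erl_syntax:list_suffix(Consed),
     length(erl_syntax:list_prefix(Explicit))}.
```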

disjunction_body(Node::syntaxTree()) -> [syntaxTree()]

eof_marker() -> syntaxTree()

Creates an abstract end-of-file marker. This represents the
end of input when reading a sequence of source code forms. An
end-of-file marker is itself regarded as a source code form
(namely, the last in any sequence in which it occurs). It has no
defined lexical form.

Note: this is retained only for backwards compatibility with
existing parsers and tools.

error_marker(Error::term()) -> syntaxTree()

Creates an abstract error marker. The result represents an
occurrence of an error in the source code, with an associated Erlang
I/O ErrorInfo structure given by Error (see module
io(3) for details). Error markers are regarded as source
code forms, but have no defined lexical form.

Note: this is supported only for backwards compatibility with
existing parsers and tools.

get_ann(Tree::syntaxTree()) -> [term()]

get_attrs(Tree::syntaxTree()) -> syntaxTreeAttributes()

Returns a representation of the attributes associated with a
syntax tree node. The attributes are all the extra information that
can be attached to a node. Currently, this includes position
information, source code comments, and user annotations. The result
of this function cannot be inspected directly; only attached to
another node (cf. set_attrs/2).

For accessing individual attributes, see get_pos/1,
get_ann/1, get_precomments/1 and
get_postcomments/1.
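
A minimal sketch of moving attributes from one node to another with get_attrs/1 and set_attrs/2, then reading individual attributes back. The module name attrs_demo and the annotation reviewed are illustrative assumptions.

```erlang
-module(attrs_demo).
-export([run/0]).

run() ->
    Fresh = erl_syntax:atom(bar),
    Old = erl_syntax:set_ann(erl_syntax:set_pos(erl_syntax:atom(foo), 17),
                             [reviewed]),
    %% The attribute bundle is opaque; it is only handed on to set_attrs/2.
    New = erl_syntax:set_attrs(Fresh, erl_syntax:get_attrs(Old)),
    {erl_syntax:get_pos(Fresh),   % default position of a new node
     erl_syntax:atom_value(New),
     erl_syntax:get_pos(New),
     erl_syntax:get_ann(New)}.
```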

get_pos(Node::syntaxTree()) -> term()

Returns the position information associated with
Node. This is usually a nonnegative integer (indicating
the source code line number), but may be any term. By default, all
new tree nodes have their associated position information set to the
integer zero.

get_postcomments(Tree::syntaxTree()) -> [syntaxTree()]

Returns the associated post-comments of a node. This is a
possibly empty list of abstract comments, in top-down textual order.
When the code is formatted, post-comments are typically displayed to
the right of and/or below the node. For example:

{foo, X, Y} % Post-comment of tuple

If possible, the comment should be moved past any following
separator characters on the same line, rather than placing the
separators on the following line.

get_precomments(Tree::syntaxTree()) -> [syntaxTree()]

Returns the associated pre-comments of a node. This is a
possibly empty list of abstract comments, in top-down textual order.
When the code is formatted, pre-comments are typically displayed
directly above the node. For example:

% Pre-comment of function
foo(X) -> {bar, X}.

If possible, the comment should be moved before any preceding
separator characters on the same line.
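
As a rough sketch of how pre-comments and post-comments end up on a node, the following attaches one of each to a tuple and reads them back. The module name comments_demo is hypothetical.

```erlang
-module(comments_demo).
-export([run/0]).

run() ->
    Pre = erl_syntax:comment([" Pre-comment of tuple"]),
    Post = erl_syntax:comment([" Post-comment of tuple"]),
    Tuple0 = erl_syntax:tuple([erl_syntax:atom(foo), erl_syntax:variable("X")]),
    Tuple1 = erl_syntax:add_precomments([Pre], Tuple0),
    Tuple2 = erl_syntax:add_postcomments([Post], Tuple1),
    {[erl_syntax:comment_text(C) || C <- erl_syntax:get_precomments(Tuple2)],
     [erl_syntax:comment_text(C) || C <- erl_syntax:get_postcomments(Tuple2)]}.
```

How the comments are laid out in printed text is up to the pretty-printer; the tree only records which comments belong before and after the node.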

implicit_fun(Module, Name::syntaxTree(), Arity::syntaxTree()) -> syntaxTree()

Module = none | syntaxTree()

Creates an abstract module-qualified "implicit fun" expression.
If Module is none, this is equivalent to
implicit_fun(Name, Arity), otherwise it is equivalent to
implicit_fun(module_qualifier(Module, arity_qualifier(Name,
Arity))).

Note: not all literals are leaf nodes, and vice versa. E.g.,
tuples with nonzero arity and nonempty lists may be literals, but are
not leaf nodes. Variables, on the other hand, are leaf nodes but not
literals.

is_literal(Node::syntaxTree()) -> boolean()

is_proper_list(Node::syntaxTree()) -> boolean()

Returns true if Node represents a
proper list, and false otherwise. A proper list is a
list skeleton either of the form "[]" or
"[E1, ..., En]", or "[... |
Tail]" where recursively Tail also
represents a proper list.

Note: Since Node is a syntax tree, the actual
run-time values corresponding to its subtrees may often be partially
or completely unknown. Thus, if Node represents e.g.
"[... | Ns]" (where Ns is a variable), then
the function will return false, because it is not known
whether Ns will be bound to a list at run-time. If
Node instead represents e.g. "[1, 2, 3]" or
"[A | []]", then the function will return
true.
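
The cases discussed in the note can be checked with a short sketch (module name proper_demo is an assumption):

```erlang
-module(proper_demo).
-export([run/0]).

run() ->
    One = erl_syntax:integer(1),
    Ns = erl_syntax:variable("Ns"),
    [erl_syntax:is_proper_list(L)
     || L <- [erl_syntax:nil(),                          % []
              erl_syntax:list([One]),                    % [1]
              erl_syntax:list([One], erl_syntax:nil()),  % [1 | []]
              erl_syntax:list([One], Ns),                % [1 | Ns]
              Ns]].                                      % a bare variable
```

Only the syntactic shape is examined, so the skeleton ending in the variable Ns is reported as not proper even though Ns may be bound to a list at run-time.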

list(List) -> syntaxTree()

list(Elements::List, Tail) -> syntaxTree()

List = [syntaxTree()]

Tail = none | syntaxTree()

Constructs an abstract list skeleton. The result has type
list or nil. If List is a
nonempty list [E1, ..., En], the result has type
list and represents either "[E1, ...,
En]", if Tail is none, or
otherwise "[E1, ..., En |
Tail]". If List is the empty list,
Tail must be none, and in that
case the result has type nil and represents
"[]" (cf. nil/0).

The difference between lists as semantic objects (built up of
individual "cons" and "nil" terms) and the various syntactic forms
for denoting lists may be bewildering at first. This module provides
functions both for exact control of the syntactic representation and
for simple composition and deconstruction in terms of
cons and head/tail operations.

Note: in list(Elements, none), the "nil" list
terminator is implicit and has no associated information (cf.
get_attrs/1), while in the seemingly equivalent
list(Elements, Tail) when Tail has type
nil, the list terminator subtree Tail may
have attached attributes such as position, comments, and annotations,
which will be preserved in the result.

list_prefix(Node::syntaxTree()) -> [syntaxTree()]

list_suffix(Node::syntaxTree()) -> none | syntaxTree()

Returns the suffix subtree of a list node, if one
exists. If Node represents "[E1, ...,
En | Tail]", the returned value is
Tail, otherwise, i.e., if Node represents
"[E1, ..., En]", none is
returned.

Note that even if this function returns some Tail
that is not none, the type of Tail can be
nil, if the tail has been given explicitly, and the list
skeleton has not been compacted (cf.
compact_list/1).

list_tail(Node::syntaxTree()) -> syntaxTree()

Returns the tail of a list node. If
Node represents a single-element list
"[E]", then the result has type
nil, representing "[]". If
Node represents "[E1, E2
...]", the result will represent "[E2
...]", and if Node represents
"[Head | Tail]", the result will
represent "Tail".
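
The behaviour of list_suffix/1 and list_tail/1 on implicit and explicit tails can be sketched like this, with list_tail_demo as a hypothetical module name:

```erlang
-module(list_tail_demo).
-export([run/0]).

run() ->
    E1 = erl_syntax:integer(1),
    E2 = erl_syntax:integer(2),
    T = erl_syntax:variable("T"),
    Implicit = erl_syntax:list([E1]),                    % [1]
    Explicit = erl_syntax:list([E1], erl_syntax:nil()),  % [1 | []]
    {erl_syntax:list_suffix(Implicit),
     erl_syntax:type(erl_syntax:list_suffix(Explicit)),
     erl_syntax:type(erl_syntax:list_tail(Implicit)),
     erl_syntax:concrete(erl_syntax:list_tail(erl_syntax:list([E1, E2]))),
     erl_syntax:variable_name(erl_syntax:list_tail(erl_syntax:list([E1], T)))}.
```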

macro(Name) -> syntaxTree()

macro(Name::syntaxTree(), Arguments) -> syntaxTree()

Creates an abstract macro application. If Arguments
is none, the result represents
"?Name", otherwise, if Arguments
is [A1, ..., An], the result represents
"?Name(A1, ..., An)".

Notes: if Arguments is the empty list, the result
will thus represent "?Name()", including a pair
of matching parentheses.

The only syntactical limitation imposed by the preprocessor on the
arguments to a macro application (viewed as sequences of tokens) is
that they must be balanced with respect to parentheses, brackets,
begin ... end, case ... end, etc. The
text node type can be used to represent arguments which
are not regular Erlang constructs.
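
A small sketch of the distinction between no argument list and an empty one. The module name macro_demo and the macro name LIMIT are illustrative assumptions.

```erlang
-module(macro_demo).
-export([run/0]).

run() ->
    Name = erl_syntax:variable("LIMIT"),
    Bare = erl_syntax:macro(Name),        % ?LIMIT
    Empty = erl_syntax:macro(Name, []),   % ?LIMIT()
    {erl_syntax:macro_arguments(Bare),
     erl_syntax:macro_arguments(Empty),
     erl_syntax:variable_name(erl_syntax:macro_name(Bare))}.
```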

macro_name(Node::syntaxTree()) -> syntaxTree()

make_tree(Type::atom(), Groups::[[syntaxTree()]]) -> syntaxTree()

Creates a syntax tree with the given type and subtrees.
Type must be a node type name (cf. type/1)
that does not denote a leaf node type (cf. is_leaf/1).
Groups must be a nonempty list of groups of
syntax trees, representing the subtrees of a node of the given type,
in left-to-right order as they would occur in the printed program
text, grouped by category as done by subtrees/1.

The result of copy_attrs(Node, make_tree(type(Node),
subtrees(Node))) (cf. update_tree/2) represents
the same source code text as the original Node, assuming
that subtrees(Node) yields a nonempty list. However, it
does not necessarily have the same data representation as
Node.
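
The rebuild expression from the paragraph above, written out as a sketch that also covers leaf nodes. The module make_tree_demo and the function rebuild/1 are hypothetical names.

```erlang
-module(make_tree_demo).
-export([rebuild/1]).

%% Rebuilds a node from its type and subtree groups, keeping its attributes.
rebuild(Node) ->
    case erl_syntax:subtrees(Node) of
        [] ->
            Node;  % leaf node: make_tree/2 needs a nonempty group list
        Groups ->
            Fresh = erl_syntax:make_tree(erl_syntax:type(Node), Groups),
            erl_syntax:copy_attrs(Node, Fresh)
    end.
```

This is essentially what update_tree/2 does when given the node's own subtrees.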

match_expr_body(Node::syntaxTree()) -> syntaxTree()

match_expr_pattern(Node::syntaxTree()) -> syntaxTree()

meta(Tree::syntaxTree()) -> syntaxTree()

Creates a meta-representation of a syntax tree. The result
represents an Erlang expression "MetaTree"
which, if evaluated, will yield a new syntax tree representing the
same source code text as Tree (although the actual data
representation may be different). The expression represented by
MetaTree is implementation independent with
regard to the data structures used by the abstract syntax tree
implementation. Comments attached to nodes of Tree will
be preserved, but other attributes are lost.

Any node in Tree whose node type is
variable (cf. type/1), and whose list of
annotations (cf. get_ann/1) contains the atom
meta_var, will remain unchanged in the resulting tree,
except that exactly one occurrence of meta_var is
removed from its annotation list.
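
The following sketch, under the assumption of a hypothetical module meta_demo, shows both behaviours: an ordinary variable is turned into an expression that rebuilds it, while a variable annotated with meta_var is kept as a variable.

```erlang
-module(meta_demo).
-export([plain/0, meta_variable/0]).

%% Builds the meta-representation of a variable, evaluates it, and inspects
%% the tree that the evaluation produces.
plain() ->
    Meta = erl_syntax:meta(erl_syntax:variable("V")),
    Expr = erl_syntax:revert(Meta),
    {value, Rebuilt, _} = erl_eval:expr(Expr, erl_eval:new_bindings()),
    {erl_syntax:type(Meta),
     erl_syntax:type(Rebuilt),
     erl_syntax:variable_name(Rebuilt)}.

%% A variable carrying the meta_var annotation is left in place, minus one
%% occurrence of that annotation.
meta_variable() ->
    Var = erl_syntax:add_ann(meta_var, erl_syntax:variable("Hole")),
    Meta = erl_syntax:meta(Var),
    {erl_syntax:type(Meta), erl_syntax:get_ann(Meta)}.
```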

The main use of the function meta/1 is to transform a
data structure Tree, which represents a piece of program
code, into a form that is representation independent when
printed. E.g., suppose Tree represents a variable
named "V". Then (assuming a function print/1 for
printing syntax trees), evaluating print(abstract(Tree))
- simply using abstract/1 to map the actual data
structure onto a syntax tree representation - would output a string
that might look something like "{tree, variable, ..., "V",
...}", which is obviously dependent on the implementation of
the abstract syntax trees. This could e.g. be useful for caching a
syntax tree in a file. However, in some situations like in a program
generator generator (with two "generator"), it may be unacceptable.
Using print(meta(Tree)) instead would output a
representation independent syntax tree generating
expression; in the above case, something like
"erl_syntax:variable("V")".

operator(Name) -> syntaxTree()

Name = atom() | string()

Creates an abstract operator. The name of the operator is the
character sequence represented by Name. This is
analogous to the print name of an atom, but an operator is never
written within single-quotes; e.g., the result of
operator('++') represents "++" rather
than "'++'".
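
A brief sketch showing that the atom and string forms of Name give the same operator (module name operator_demo is an assumption):

```erlang
-module(operator_demo).
-export([run/0]).

run() ->
    FromAtom = erl_syntax:operator('++'),
    FromString = erl_syntax:operator("++"),
    {erl_syntax:operator_name(FromAtom),
     erl_syntax:operator_literal(FromAtom),
     erl_syntax:operator_name(FromString)}.
```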

remove_comments(Node::syntaxTree()) -> syntaxTree()

revert(Tree::syntaxTree()) -> syntaxTree()

Returns an erl_parse-compatible representation of a
syntax tree, if possible. If Tree represents a
well-formed Erlang program or expression, the conversion should work
without problems. Typically, is_tree/1 yields
true if conversion failed (i.e., the result is still an
abstract syntax tree), and false otherwise.

The is_tree/1 test is not completely foolproof. For a
few special node types (e.g. arity_qualifier), if such a
node occurs in a context where it is not expected, it will be left
unchanged as a non-reverted subtree of the result. This can only
happen if Tree does not actually represent legal Erlang
code.
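
A minimal sketch of a successful conversion, checked with is_tree/1 and then evaluated with erl_eval. The module name revert_demo is hypothetical.

```erlang
-module(revert_demo).
-export([run/0]).

run() ->
    Tree = erl_syntax:tuple([erl_syntax:atom(ok), erl_syntax:integer(42)]),
    Parse = erl_syntax:revert(Tree),
    %% The reverted form is plain erl_parse data and can be evaluated.
    {value, Value, _} = erl_eval:expr(Parse, erl_eval:new_bindings()),
    {erl_syntax:is_tree(Tree), erl_syntax:is_tree(Parse), Value}.
```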

revert_forms(L::Forms) -> [erl_parse()]

Forms = syntaxTree() | [syntaxTree()]

Reverts a sequence of Erlang source code forms. The sequence can
be given either as a form_list syntax tree (possibly
nested), or as a list of "program form" syntax trees. If successful,
the corresponding flat list of erl_parse-compatible
syntax trees is returned (cf. revert/1). If some program
form could not be reverted, {error, Form} is thrown.
Standalone comments in the form sequence are discarded.
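
One common use of revert_forms/1 is to hand generated forms to the compiler. The sketch below assumes hypothetical names (forms_demo, forms_demo_generated, answer/0) and builds a one-function module, including a standalone comment that is dropped during reversion.

```erlang
-module(forms_demo).
-export([build/0, load/0]).

%% Builds the forms of a tiny module and reverts them to erl_parse format.
build() ->
    Mod = erl_syntax:attribute(erl_syntax:atom(module),
                               [erl_syntax:atom(forms_demo_generated)]),
    Exported = erl_syntax:arity_qualifier(erl_syntax:atom(answer),
                                          erl_syntax:integer(0)),
    Export = erl_syntax:attribute(erl_syntax:atom(export),
                                  [erl_syntax:list([Exported])]),
    Note = erl_syntax:comment([" standalone comment between forms"]),
    Clause = erl_syntax:clause([], none, [erl_syntax:integer(42)]),
    Fun = erl_syntax:function(erl_syntax:atom(answer), [Clause]),
    erl_syntax:revert_forms([Mod, Export, Note, Fun]).

%% Compiles and loads the generated module, then calls its function.
load() ->
    {ok, Name, Binary} = compile:forms(build()),
    {module, Name} = code:load_binary(Name, "forms_demo_generated.erl", Binary),
    Name:answer().
```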

size_qualifier_body(Node::syntaxTree()) -> syntaxTree()

string(Value::string()) -> syntaxTree()

Creates an abstract string literal. The result represents
"Text" (including the surrounding
double-quotes), where Text corresponds to the sequence
of characters in Value, but not representing a
specific string literal. E.g., the result of
string("x\ny") represents any and all of
"x\ny", "x\12y", "x\012y" and
"x\^Jy"; cf. char/1.

string_value(Node::syntaxTree()) -> string()

subtrees(Node::syntaxTree()) -> [[syntaxTree()]]

Returns the grouped list of all subtrees of a syntax tree. If
Node is a leaf node (cf. is_leaf/1), this
is the empty list, otherwise the result is always a nonempty list,
containing the lists of subtrees of Node, in
left-to-right order as they occur in the printed program text, and
grouped by category. Often, each group contains only a single
subtree.

Depending on the type of Node, the size of some
groups may be variable (e.g., the group consisting of all the
elements of a tuple), while others always contain the same number of
elements - usually exactly one (e.g., the group containing the
argument expression of a case-expression). Note, however, that the
exact structure of the returned list (for a given node type) should
in general not be depended upon, since it might be subject to change
without notice.

The function subtrees/1 and the constructor functions
make_tree/2 and update_tree/2 can be a
great help if one wants to traverse a syntax tree, visiting all its
subtrees, but treat nodes of the tree in a uniform way in most or all
cases. Using these functions makes this simple, and also assures that
your code is not overly sensitive to extensions of the syntax tree
data type, because any node types not explicitly handled by your code
can be left to a default case.

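For example, given a function postorder/2 that applies a function to
every node of a tree in post-order (built from subtrees/1 and
update_tree/2), and a function f/1 that prefixes the name of each atom
node with "a_" and returns all other nodes unchanged,
the call postorder(fun f/1, Tree) will yield a new
representation of Tree in which all atom names have been
extended with the prefix "a_", but nothing else (including comments,
annotations and line numbers) has been changed.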
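
A traversal of this kind could be sketched as follows. The module name postorder_demo and the function prefix_atom/1 are hypothetical, and copy_attrs/2 is used here as one way to keep the attributes of renamed atom nodes.

```erlang
-module(postorder_demo).
-export([postorder/2, prefix_atom/1]).

%% Applies F to every node, children first. update_tree/2 rebuilds each
%% inner node from its transformed subtrees and preserves its attributes.
postorder(F, Tree) ->
    F(case erl_syntax:subtrees(Tree) of
          [] ->
              Tree;
          Groups ->
              erl_syntax:update_tree(
                Tree, [[postorder(F, T) || T <- Group] || Group <- Groups])
      end).

%% Prefixes atom names with "a_"; every other node type falls through
%% to the default case unchanged.
prefix_atom(Node) ->
    case erl_syntax:type(Node) of
        atom ->
            Renamed = erl_syntax:atom("a_" ++ erl_syntax:atom_name(Node)),
            erl_syntax:copy_attrs(Node, Renamed);
        _ ->
            Node
    end.
```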

text(String::string()) -> syntaxTree()

Creates an abstract piece of source code text. The result
represents exactly the sequence of characters in String.
This is useful in cases when one wants full control of the resulting
output, e.g., for the appearance of floating-point numbers or macro
definitions.

tree(Type) -> syntaxTree()

tree(Type::atom(), Data::term()) -> syntaxTree()

This function and the related is_tree/1 and
data/1 provide a uniform way to extend the set of
erl_parse node types. The associated data is any term,
whose format may depend on the type tag.

Notes:

Any nodes created outside of this module must have type tags
distinct from those currently defined by this module; see
type/1 for a complete list.

The type tag of a syntax tree node may also be used
as a primary tag by the erl_parse representation;
in that case, the selector functions for that node type
must handle both the abstract syntax tree and the
erl_parse form. The function type(T)
should return the correct type tag regardless of the
representation of T, so that the user sees no
difference between erl_syntax and
erl_parse nodes.
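
A minimal sketch of a custom node created with tree/2 and inspected with is_tree/1, type/1 and data/1. The module name tree_demo, the type tag my_placeholder and its data format are all illustrative assumptions.

```erlang
-module(tree_demo).
-export([run/0]).

run() ->
    %% my_placeholder is chosen so that it cannot clash with a type tag
    %% defined by erl_syntax itself.
    Node = erl_syntax:tree(my_placeholder, {slot, 1}),
    {erl_syntax:is_tree(Node),
     erl_syntax:type(Node),
     erl_syntax:data(Node)}.
```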

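try_expr(Body::[syntaxTree()], Clauses::[syntaxTree()], Handlers::[syntaxTree()], After::[syntaxTree()]) -> syntaxTree()

Creates an abstract try-expression. If Body is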
[B1, ..., Bn], Clauses is [C1, ...,
Cj], Handlers is [H1, ..., Hk], and
After is [A1, ..., Am], the result
represents "try B1, ..., Bn of C1;
...; Cj catch H1; ...; Hk after
A1, ..., Am end". More exactly, if each
Ci represents "(CPi) CGi ->
CBi", and each Hi represents
"(HPi) HGi -> HBi", then the
result represents "try B1, ..., Bn of
CP1 CG1 -> CB1; ...; CPj
CGj -> CBj catch HP1 HG1 ->
HB1; ...; HPk HGk -> HBk after
A1, ..., Am end"; cf.
case_expr/2. If Clauses is the empty list,
the of ... section is left out. If After is
the empty list, the after ... section is left out. If
Handlers is the empty list, and After is
nonempty, the catch ... section is left out.
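
As a sketch, the following builds a try-expression with a handler and an after section but no of section, then inspects its parts. The module name try_demo and the called functions risky/0 and cleanup/0 are hypothetical.

```erlang
-module(try_demo).
-export([run/0]).

%% Intended to represent:
%%   try risky() catch Reason -> failed after cleanup() end
run() ->
    Body = [erl_syntax:application(erl_syntax:atom(risky), [])],
    Handler = erl_syntax:clause([erl_syntax:variable("Reason")], none,
                                [erl_syntax:atom(failed)]),
    Cleanup = [erl_syntax:application(erl_syntax:atom(cleanup), [])],
    Try = erl_syntax:try_expr(Body, [], [Handler], Cleanup),
    {erl_syntax:type(Try),
     length(erl_syntax:try_expr_clauses(Try)),
     length(erl_syntax:try_expr_handlers(Try)),
     length(erl_syntax:try_expr_after(Try))}.
```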

variable(Name) -> syntaxTree()

Creates an abstract variable with the given name.
Name may be any atom or string that represents a
lexically valid variable name, but not a single underscore
character; cf. underscore/0.

Note: no checking is done whether the character sequence
represents a proper variable name, i.e., whether or not its first
character is an uppercase Erlang character, or whether it does not
contain control characters, whitespace, etc.
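
The lack of checking can be seen in a short sketch (module name variable_demo is an assumption):

```erlang
-module(variable_demo).
-export([run/0]).

run() ->
    FromString = erl_syntax:variable("Count"),
    FromAtom = erl_syntax:variable('Count'),
    %% Accepted even though it is not a lexically valid variable name.
    Unchecked = erl_syntax:variable("not a variable"),
    {erl_syntax:variable_name(FromString),
     erl_syntax:variable_literal(FromAtom),
     erl_syntax:variable_name(Unchecked)}.
```

Validating names before calling variable/1 is therefore the caller's responsibility.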

variable_literal(Node::syntaxTree()) -> string()

variable_name(Node::syntaxTree()) -> atom()

warning_marker(Error::term()) -> syntaxTree()

Creates an abstract warning marker. The result represents an
occurrence of a possible problem in the source code, with an
associated Erlang I/O ErrorInfo structure given by Error
(see module io(3) for details). Warning markers are
regarded as source code forms, but have no defined lexical form.

Note: this is supported only for backwards compatibility with
existing parsers and tools.