Regular expressions

Compile a regular expression. The following constructs are
recognized:

. Matches any character except newline.

* (postfix) Matches the preceding expression zero, one or
several times.

+ (postfix) Matches the preceding expression one or
several times.

? (postfix) Matches the preceding expression once or
not at all.

[..] Character set. Ranges are denoted with -, as in [a-z].
An initial ^, as in [^0-9], complements the set.
To include a ] character in a set, make it the first
character of the set. To include a - character in a set,
make it the first or the last character of the set.

^ Matches at beginning of line (either at the beginning of
the matched string, or just after a newline character).

$ Matches at end of line (either at the end of the matched
string, or just before a newline character).

\| (infix) Alternative between two expressions.

\(..\) Grouping and naming of the enclosed expression.

\1 The text matched by the first \(...\) expression
(\2 for the second expression, and so on up to \9).
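
As a small illustration of the syntax above, the following sketch checks a few patterns with Str.string_match. The helper name matches is hypothetical, and the program assumes the str library is linked. Note that backslashes must be doubled inside OCaml string literals.

```ocaml
(* Hypothetical helper: does [pattern] match a prefix of [s] starting at 0? *)
let matches pattern s =
  Str.string_match (Str.regexp pattern) s 0

(* Character set, + and ? postfix operators. *)
let set_example = matches "[a-z]+[0-9]?" "abc7"

(* Grouping \(..\), alternative \| and end-of-line anchor $. *)
let alt_example = matches "\\(cat\\|dog\\)s?$" "dogs"

(* Back-reference \1: the same letter twice in a row. *)
let backref_yes = matches "\\([a-z]\\)\\1" "aa"
let backref_no = matches "\\([a-z]\\)\\1" "ab"
```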

search_forward r s start searches the string s for a substring
matching the regular expression r. The search starts at position
start and proceeds towards the end of the string.
Return the position of the first character of the matched
substring, or raise Not_found if no substring matches.
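
A minimal sketch of a forward search, assuming the str library is linked. The wrapper find_number is a hypothetical name; it turns Not_found into an option and uses Str.matched_string to recover the matched text.

```ocaml
(* Hypothetical wrapper: position and text of the first run of digits
   found at or after [start], or None if there is none. *)
let find_number s start =
  let r = Str.regexp "[0-9]+" in
  match Str.search_forward r s start with
  | pos -> Some (pos, Str.matched_string s)
  | exception Not_found -> None
```

For example, find_number "abc 123 de 45" 0 evaluates to Some (4, "123"), and starting the search at position 7 yields Some (11, "45").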

search_backward r s last searches the string s for a
substring matching the regular expression r. The search first
considers substrings that start at position last and proceeds
towards the beginning of the string. Return the position of the first
character of the matched substring; raise Not_found if no
substring matches.
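
A backward search can be sketched in the same way. The helper last_slash_before is a hypothetical name, and the str library is assumed to be linked.

```ocaml
(* Hypothetical wrapper: position of the nearest '/' that starts at or
   before index [last], or None if there is none. *)
let last_slash_before s last =
  match Str.search_backward (Str.regexp "/") s last with
  | pos -> Some pos
  | exception Not_found -> None
```

With s = "usr/local/bin", a search from position 12 finds the slash at 9, while a search from position 8 finds the one at 3.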

match_end () returns the position of the character following the
last character of the substring that was matched by string_match,
search_forward or search_backward.
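
The sketch below combines the result of search_forward with match_end to obtain the half-open span of a match. The name span_of_first_word is hypothetical; the function raises Not_found if the string contains no lowercase word.

```ocaml
(* Hypothetical helper: (start, stop) of the first run of lowercase
   letters, where stop is the index just past the last matched character. *)
let span_of_first_word s =
  let r = Str.regexp "[a-z]+" in
  let first = Str.search_forward r s 0 in
  let past_last = Str.match_end () in
  (first, past_last)
```

Because the end position is exclusive, the length of the match is past_last - first.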

val matched_group : int -> string -> string

matched_group n s returns the substring of s that was matched
by the nth group \(...\) of the regular expression during
the latest Str.string_match, Str.search_forward or
Str.search_backward.
The user must make sure that the parameter s is the same string
that was passed to the matching or searching function.
matched_group n s raises Not_found if the nth group
of the regular expression was not matched. This can happen
with groups inside alternatives \|, options ?
or repetitions *. For instance, the empty string will match
\(a\)*, but matched_group 1 "" will raise Not_found
because the first group itself was not matched.
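
A short sketch of group extraction, assuming the str library is linked. The names group_opt, parse_key_value and star_group_on_empty are hypothetical; group_opt converts the Not_found case described above into None.

```ocaml
(* Hypothetical wrapper: text of group [n] from the latest match, if any. *)
let group_opt n s =
  match Str.matched_group n s with
  | text -> Some text
  | exception Not_found -> None

(* Parse "key=123" into its two groups; the same string [s] is passed
   to both the matching function and matched_group. *)
let parse_key_value s =
  let r = Str.regexp "\\([a-z]+\\)=\\([0-9]+\\)" in
  if Str.string_match r s 0
  then Some (Str.matched_group 1 s, Str.matched_group 2 s)
  else None

(* The case from the text: \(a\)* matches the empty string,
   but group 1 itself is not matched. *)
let star_group_on_empty () =
  let matched = Str.string_match (Str.regexp "\\(a\\)*") "" 0 in
  (matched, group_opt 1 "")
```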

val group_beginning : int -> int

group_beginning n returns the position of the first character
of the substring that was matched by the nth group of
the regular expression.

Raises Not_found if the nth group of the regular expression
was not matched.

Raises Invalid_argument if there are fewer than n groups in
the regular expression.

val group_end : int -> int

group_end n returns the position of the character following the
last character of the substring that was matched by the nth group
of the regular expression.

Raises Not_found if the nth group of the regular expression
was not matched.

Raises Invalid_argument if there are fewer than n groups in
the regular expression.
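
The following sketch reads group positions after a successful match. The names group_span and email_spans are hypothetical, and the pattern is only an illustration. Not_found is mapped to None; Invalid_argument is deliberately left to propagate.

```ocaml
(* Hypothetical wrapper: (beginning, end) of group [n] in the latest
   match, or None if that group did not take part in the match. *)
let group_span n =
  match (Str.group_beginning n, Str.group_end n) with
  | span -> Some span
  | exception Not_found -> None

(* Spans of the two groups in a pattern of the form name@host. *)
let email_spans s =
  let r = Str.regexp "\\([a-z]+\\)@\\([a-z]+\\)" in
  if Str.string_match r s 0
  then (group_span 1, group_span 2)
  else (None, None)
```

As with match_end, the end position is the index just past the last character of the group.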

Replacement

global_replace regexp templ s returns a string identical to s,
except that all substrings of s that match regexp have been
replaced by templ. The replacement template templ can contain
\1, \2, etc; these sequences will be replaced by the text
matched by the corresponding group in the regular expression.
\0 stands for the text matched by the whole regular expression.
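
A minimal sketch of template-based replacement, assuming the str library is linked. The function names are hypothetical.

```ocaml
(* Swap the two numbers in every "A-B" pair, using \1 and \2. *)
let swap_ranges s =
  Str.global_replace (Str.regexp "\\([0-9]+\\)-\\([0-9]+\\)") "\\2-\\1" s

(* Wrap every run of digits in angle brackets, using \0. *)
let bracket_numbers s =
  Str.global_replace (Str.regexp "[0-9]+") "<\\0>" s
```

For example, swap_ranges "10-20 and 3-4" evaluates to "20-10 and 4-3", and bracket_numbers "a1b22" to "a<1>b<22>".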

global_substitute regexp subst s returns a string identical
to s, except that all substrings of s that match regexp
have been replaced by the result of function subst. The
function subst is called once for each matching substring,
and receives s (the whole text) as argument.
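
Since subst receives the whole text rather than the matched substring, it typically calls Str.matched_string (or Str.matched_group) on its argument. The sketch below illustrates this; double_numbers is a hypothetical name.

```ocaml
(* Replace every run of digits by twice its value. The callback gets
   the whole input string and extracts the current match from it. *)
let double_numbers s =
  Str.global_substitute (Str.regexp "[0-9]+")
    (fun whole ->
       let n = int_of_string (Str.matched_string whole) in
       string_of_int (2 * n))
    s
```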

substitute_first regexp subst s is the same as
Str.global_substitute, except that only the first substring
matching the regular expression is replaced.
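
A brief sketch using Str.substitute_first, with the hypothetical helper name capitalize_first_word. Only the first match is rewritten; later matches are left untouched.

```ocaml
(* Capitalize the first run of lowercase letters only. *)
let capitalize_first_word s =
  Str.substitute_first (Str.regexp "[a-z]+")
    (fun whole -> String.capitalize_ascii (Str.matched_string whole))
    s
```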

val replace_matched : string -> string -> string

replace_matched repl s returns the replacement text repl
in which \1, \2, etc. have been replaced by the text
matched by the corresponding groups in the most recent matching
operation. s must be the same string that was matched during
this matching operation.
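
The sketch below expands a template after an explicit match. The name swap_pair is hypothetical. Note that replace_matched returns only the expanded replacement text, not the input string with the replacement applied.

```ocaml
(* Match "word word" at the start of [s] and return the two words in
   reverse order; any text after the match is not part of the result. *)
let swap_pair s =
  let r = Str.regexp "\\([a-z]+\\) \\([a-z]+\\)" in
  if Str.string_match r s 0
  then Some (Str.replace_matched "\\2 \\1" s)
  else None
```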

Splitting

split r s splits s into substrings, taking as delimiters
the substrings that match r, and returns the list of substrings.
For instance, split (regexp "[ \t]+") s splits s into
blank-separated words. Occurrences of the delimiter at the
beginning or at the end of the string are ignored.
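
A minimal sketch of the word-splitting example, assuming the str library is linked; words is a hypothetical name.

```ocaml
(* Split on runs of spaces and tabs; leading and trailing blanks
   do not produce empty strings. *)
let words s = Str.split (Str.regexp "[ \t]+") s
```

For instance, words "  the quick\tbrown  fox " evaluates to ["the"; "quick"; "brown"; "fox"].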

split_delim r s is the same as Str.split, but occurrences of the
delimiter at the beginning and at the end of the string are
recognized and returned as empty strings in the result.
For instance, split_delim (regexp " ") " abc "
returns [""; "abc"; ""], while split with the same
arguments returns ["abc"].
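
The difference between the two functions can be sketched on a comma-separated string. The names fields and fields_trimmed are hypothetical.

```ocaml
(* Keep empty fields at both ends of the string. *)
let fields s = Str.split_delim (Str.regexp ",") s

(* Ignore delimiters at the beginning and at the end of the string. *)
let fields_trimmed s = Str.split (Str.regexp ",") s
```

Here fields ",a,b," evaluates to [""; "a"; "b"; ""], whereas fields_trimmed ",a,b," evaluates to ["a"; "b"].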

full_split r s is the same as Str.split_delim, but returns
the delimiters as well as the substrings contained between
delimiters. The former are tagged Delim in the result list;
the latter are tagged Text. For instance,
full_split (regexp "[{}]") "{ab}" returns
[Delim "{"; Text "ab"; Delim "}"].
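
As a rough illustration, full_split can serve as a very simple tokenizer. The names tokens and describe are hypothetical, and the str library is assumed to be linked.

```ocaml
(* Split an expression on + and -, keeping the operators as Delim items. *)
let tokens s = Str.full_split (Str.regexp "[+-]") s

(* Render one item of the result list for display. *)
let describe = function
  | Str.Delim d -> "delim:" ^ d
  | Str.Text t -> "text:" ^ t
```

For example, tokens "1+22-3" evaluates to [Text "1"; Delim "+"; Text "22"; Delim "-"; Text "3"].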