SYNOPSIS

DESCRIPTION

In the magic mode of MicroEmacs '06, certain characters gain special meanings when used in a search pattern. Collectively they are know as regular expressions, and a limited number of them are supported in MicroEmacs '06. They grant greater flexibility when using the search commands (note that they also affect
isearch-forward(2) commands).

The symbols that have special meaning in magic mode are ^, $, ., \|, *, [], \(\), \{\} and \.

The characters ^ and $ fix the search pattern to the beginning and end of line, respectively. The ^ character must appear at the beginning of the search string, and the $ must appear at the end, otherwise they loose their meaning and are treated just like any other character. For example, in magic mode, searching for the pattern "t$" would put the cursor at the end of any line that ended with the letter 't'. Note that this is different than searching for "t<NL>", that is, 't' followed by a newline character. The character $ (and ^, for that matter) matches a position, not a character, so the cursor remains at the end of the line. But a newline is a character that must be matched, just like any other character, which means that the cursor is placed just after it - on the beginning of the next line.

The character '.' has a very simple meaning - it matches any single character, except the newline. Thus a search for "bad.er" could match "badger", "badder" (slang), or up to the 'r' of "bad error".

The character * is known as closure, and means that zero or more of the preceding character will match. If there is no character preceding, * has no special meaning, and since it will not match with a newline, * will have no special meaning if preceded by the beginning of line symbol ^ or the literal newline character <NL>. The notion of zero or more characters is important. If, for example, your cursor was on the line

This line is missing two vowels.

and a search was made for "a*", the cursor would not move, because it is guaranteed to match no letter 'a', which satisfies the search conditions. If you wanted to search for one or more of the letter 'a', you would search for "aa*", which would match the letter a, then zero or more of them, note that this pattern is better searched using "a+".

The character "+" is the same as "*" except that it searches for one or more occurrences of the preceding character.

The character [ indicates the beginning of a character class. It is similar to the any (.) character, but you get to choose which characters you want to match. The character class is ended with the character ]. So, while a search for "ba.e" will match "bane", "bade", "bale", "bate", et cetera, you can limit it to matching "babe" and "bake" by searching for "ba[bk]e". Only one of the characters inside the [ and ] will match a character. If in fact you want to match any character except those in the character class, you can put a ^ as the first character. It must be the first character of the class, or else it has no special meaning. So, a search for [^aeiou] will match any character except a vowel, but a search for [aeiou^] will match any vowel or a ^. If you have a lot of characters in order that you want to put in the character class, you may use a dash (-) as a range character. So, [a-z] will match any letter (or any lower case letter if exact mode is on), and [0-9a-f] will match any digit or any letter 'a' through 'f', which happen to be the characters for hexadecimal numbers. If the dash is at the beginning or end of a character class, it is taken to be just a dash.

The ? character provides a simple zero or one occurrence test of the previous character e.g. "ca?r" matches "cr" and "car", it will not match "caar".

Where a previous item has a range of repetitions then the \{N,M\} syntax may be used to denote the minimum and maximum iterations of the previous item. Where a set quantity of repetitions is required then the simpler syntax of \{N\} may be used. i.e. "ca\{2\}r" matches "caar", "ca\{2,3\}r" matches "caar" and "caaar".

The escape character \ is for those times when you want to be in magic mode, but also want to use a regular expression character to be just a character. It turns off the special meaning of the character. So a search for "it\." will search for a line with "it.", and not "it" followed by any other character. The escape character will also let you put ^, -, or ] inside a character class with no special side effects.

In
search-replace strings the \(\) pair may be used to group characters for in the search string for recall in the replacement string. The \(\) bracket pair is recalled using \1-\9 in the replace string, \1 is the first pair, \1 the second and so on. Hence to replace %dgdg%name%dhdh% with %dgdg%names%dhdh% then we could use the following search replace string \(%[a-z]+%\)\([a-z]*\)\(%[a-z]+%\) replacing with \1\2s\3.

\0 in the replace string implies the whole string.

A summary of magic mode special characters are defined as follows:-

^

Anchor search at beginning of line

$

Anchor search at end of line

.

Match any character except <NL>

*

Match zero or more occurrences of the preceding item.

\|

Match either/or i.e. car\|bike matches the work car and matches the word bike.

+

Match one or more occurrences of the preceding item.

?

Match zero or one occurrences of the preceeding item.

[]

Match a class of characters ([a-z] would be all alphabetics)

\

Take next literally

\{N,M\}

Match a minimum of N occurrences and maximum of M occurrences of the preceeding item.

\{N\}

Match a N occurrences of the preceeding item.

\(...\)

Delimit pattern to replicate in replace string. Max of 9 allowed. Called in replace string with \1,..,\9. 1 being 1st etc. \0 or \& in the replace string is the whole string. i.e.