Regular expressions in Check_MK

Last updated: February 26, 2018

1. Introduction

Regular expressions (regexes for short) are used in Check_MK to specify
service names, and in many other places as well. They are character strings
that serve as patterns which either match or do not match a given text.
Regexes are useful for many practical tasks, for example to formulate
flexible rules that affect all services whose names include foo or bar.

Regexes are often confused with search patterns for file names, because both
use the special characters * and ?. These so-called globbing patterns,
however, have a quite different syntax and are not nearly as powerful as
regular expressions. If you are uncertain whether a regular expression is
allowed in a particular field, consult the online help.

In this article we explain the most important uses of regular expressions,
but by no means all of them. If the options shown here are insufficient for
your needs, you can find pointers to more comprehensive information further
below. And of course there is always the internet.

1.1. Normal characters and the dot

With regular expressions it is always a question of whether a pattern (the
expression) matches a specific text, e.g., a service name. A pattern can
include special characters that have a 'magic' meaning. All normal characters
in the expression simply match themselves.

Check_MK does not distinguish between upper-case and lower-case letters here.
The expression CPU load thus matches the text CPU load as well as the text
cpu LoAd. Note: in entry fields where an exact match is required without
regular expressions (mainly host names), case is always significant!

The most important special character is the dot (.).
It matches any single character:

Example:

Regular Expression    Match       Match       No match
Me.er                 Meier       Meyer       Meyyer
.var.log              1var2log    /var/log    /var//log
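
The behaviour of the dot and of case-insensitive matching can be tried out
with a minimal Python sketch. It assumes that Python's re module with the
re.IGNORECASE flag is an adequate stand-in for the comparison described here;
the helper name matches is hypothetical and not part of Check_MK.

```python
import re


def matches(pattern: str, text: str) -> bool:
    """Return True if the whole text matches the pattern, ignoring case."""
    # Assumption: re.IGNORECASE stands in for the case-insensitive comparison
    # described above; re.fullmatch tests the complete text, as in the table.
    return re.fullmatch(pattern, text, re.IGNORECASE) is not None


for pattern, text in [
    ("CPU load", "cpu LoAd"),
    ("Me.er", "Meier"),
    ("Me.er", "Meyyer"),
    (".var.log", "/var/log"),
    (".var.log", "/var//log"),
]:
    print(f"{pattern!r} vs {text!r}: {matches(pattern, text)}")
```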

1.2. Using a backslash to mask special characters

Since the dot matches any character, it naturally also matches a dot.
If you want to match only a literal dot, you must mask (escape) it with a
backslash (\). The same applies to all other special characters,
as we shall see. These are: \ . * + ? { } ( ) [ ] | & ^ and $.

Regular Expression    Match          No match        No match
example\.com          example.com    example\.com    example-com
How\?                 How?           How\?           How
C:\\Programs          C:\Programs    C:Programs      C:\\Programs
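
Escaping can be checked in the same way. The following sketch uses Python's
re module as an assumed stand-in for the regex engine; whole_match is a
hypothetical helper. It also shows re.escape(), a Python function that adds
the backslashes automatically when a text is to be taken literally.

```python
import re


def whole_match(pattern: str, text: str) -> bool:
    """Return True if the pattern matches the complete text."""
    return re.fullmatch(pattern, text) is not None


# Raw strings (r"...") pass the backslashes to the regex engine unchanged.
print(whole_match(r"example\.com", "example.com"))   # escaped dot, literal dot
print(whole_match(r"example\.com", "example-com"))   # escaped dot, other character
print(whole_match(r"example.com", "example-com"))    # unescaped dot
print(whole_match(r"How\?", "How?"))
print(whole_match(r"C:\\Programs", "C:\\Programs"))  # the text is C:\Programs

# re.escape() masks the special characters of an arbitrary text.
literal = re.escape("example.com")
print(whole_match(literal, "example.com"), whole_match(literal, "example-com"))
```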

1.3. Repeating characters

Very often you will want to state that an arbitrary sequence of characters
may appear somewhere in an expression. In regexes this is written as .*
(dot asterisk). This is actually only a special case: the asterisk means that
the preceding character may occur any number of times in the text, and an
empty sequence is also a valid sequence. Thus .* matches any character string,
and * matches the preceding character any number of times:

Regular Expression    Match          Match         No match
State.*OK             State is OK    State = OK    StatOK
State*OK              StateOK        StatOK        State OK
a *= *5               a=5            a = 5         a==5
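
The difference between .* and a plain * can be made visible with a short
Python sketch. It assumes Python's re module as the regex engine and tests
the complete text with re.fullmatch; the helper star_match is hypothetical.

```python
import re


def star_match(pattern: str, text: str) -> bool:
    """Return True if the pattern matches the complete text."""
    return re.fullmatch(pattern, text) is not None


cases = [
    ("State.*OK", "State is OK"),  # .* stands for " is "
    ("State.*OK", "StatOK"),       # the literal "e" of "State" is missing
    ("State*OK", "StatOK"),        # e* allows zero occurrences of "e"
    ("State*OK", "State OK"),      # a blank is not an "e"
    ("a *= *5", "a = 5"),          # " *" allows any number of blanks
    ("a *= *5", "a==5"),           # only one "=" is allowed
]

for pattern, text in cases:
    print(f"{pattern!r} vs {text!r}: {star_match(pattern, text)}")
```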

The + is almost the same as *, but it does not allow an empty sequence.
The preceding character must occur at least once:

Regular Expression    Match         Match          No match
State +OK             State OK      State  OK      StateOK
switch +off           switch off    switch  off    switchoff

If you wish to restrict the number of repetitions, braces let you specify
either a precise number or a range:

Regular Expression    Match    Match    No match
Ax{3}B                AxxxB             AxB
Ax{2,4}               Axx      Axxxx    Ax

A question mark is the abbreviation for {0,1}, i.e. something that appears
once or not at all. It thus marks the preceding character as optional:

Regular Expression    Match    Match     No match
a-?b                  ab       a-b       a--b
Meyi?er               Meyer    Meyier    Meyiier
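
The operators +, {n}, {min,max} and ? can be compared side by side in a small
Python sketch. As before, Python's re module is only an assumed stand-in for
the regex engine, and the helper repeat_match is hypothetical.

```python
import re


def repeat_match(pattern: str, text: str) -> bool:
    """Return True if the pattern matches the complete text."""
    return re.fullmatch(pattern, text) is not None


cases = [
    ("State +OK", "State    OK"),  # + needs at least one blank
    ("State +OK", "StateOK"),      # no blank at all
    ("Ax{3}B", "AxxxB"),           # exactly three x
    ("Ax{3}B", "AxB"),             # only one x
    ("Ax{2,4}", "Axxxx"),          # between two and four x
    ("Ax{2,4}", "Ax"),             # too few x
    ("a-?b", "ab"),                # the minus is optional
    ("a-{0,1}b", "a-b"),           # ? written out as {0,1}
    ("a-?b", "a--b"),              # two minus signs are too many
]

for pattern, text in cases:
    print(f"{pattern!r} vs {text!r}: {repeat_match(pattern, text)}")
```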

1.4. Character classes, numerals and letters

Character classes allow conditions such as 'a numeral must occur here'. To
this end, place all permitted characters in square brackets. You can also
specify ranges with a minus sign.
Note: the order of the ASCII character set applies here.

For example, [abc] stands for exactly one of the letters a, b or c,
and [0-9] for any numeral; both can be combined.
A negation is also possible:
adding a ^ as the first character in the brackets makes [^abc] stand for any
character except a, b or c.

Character classes can of course be combined with other operations. Here are some
abstract examples:

Character class    Meaning
[abc]              Exactly one of the letters a, b or c.
[0-9a-z_]          Exactly one numeral, letter or underscore.
[^abc]             Any character except a, b or c.
[ --]              Exactly one character between the space character and the minus sign, in accordance with the ASCII table.
[0-9a-z]{1,20}     A designator with a maximum of 20 letters or numerals.

The following are a few practical examples:

Regular Expression     Match              Match               No match
[0-7]                  0                  5                   9
[0-7]{2}               00                 53                  123
myhost_[0-9a-z_]{3}    myhost_1a3         myhost_1_5          myhost_1234
[+0-9/ --]+            +49 89 18904350    089 / 1890 435-0    089 : 1890 435-0

Note: If you need one of the characters - or ] in a class,
you will need a trick.
Place the - directly at the end of the class, as shown in the
preceding example.
This makes it clear to the regex interpreter that it cannot be a range.
Place the closing square bracket as the first character in the class.
Since no empty classes are permitted, it will be interpreted as a normal character.
A class with precisely these two characters looks like this: []-].
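
Character classes, including the []-] trick, can be verified with the
following Python sketch. It assumes Python's re module as the regex engine
and tests the complete text with re.fullmatch; class_match is a hypothetical
helper name.

```python
import re


def class_match(pattern: str, text: str) -> bool:
    """Return True if the pattern matches the complete text."""
    return re.fullmatch(pattern, text) is not None


cases = [
    ("[0-7]", "5"),                          # 5 lies in the range 0 to 7
    ("[0-7]", "9"),                          # 9 lies outside the range
    ("[0-7]{2}", "53"),                      # exactly two such numerals
    ("[0-7]{2}", "123"),                     # three numerals are too many
    ("myhost_[0-9a-z_]{3}", "myhost_1_5"),   # underscore is in the class
    ("myhost_[0-9a-z_]{3}", "myhost_1234"),  # four characters instead of three
    ("[^abc]", "d"),                         # anything but a, b or c
    ("[^abc]", "a"),
    ("[]-]", "]"),                           # ] placed first is a normal character
    ("[]-]", "-"),                           # - placed last is a normal character
    ("[]-]", "x"),
]

for pattern, text in cases:
    print(f"{pattern!r} vs {text!r}: {class_match(pattern, text)}")
```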

1.5. Beginning and end, prefix, suffix and infix

When comparing regular expressions with service names and other elements,
Check_MK always checks whether the expression matches the beginning of the text.
The reason is that this is what you usually need.
A rule whose service conditions contain the terms CPU and core
thus applies to all services whose names begin with one of these terms.

This is described as a prefix match. Should you require an exact match,
you can achieve it by appending a $, which matches the end of the text.
If instead it is sufficient for the expression to match anywhere in the text
(a so-called infix match), prefix the expression with the familiar .*:

Regular Expression    Match        Match            No match
/var                  /var         /var/log         /test/var
/var$                 /var                          /var/log
.*/var$               /var         /test/var        /var/log
.*/var                /test/var    /test/var/log    \test\var\log
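
Prefix, exact and infix matches can be modelled in a few lines of Python.
This is only a sketch: it assumes that Python's re.match(), which anchors the
pattern at the beginning of the text but not at the end, behaves like the
prefix match described here. The helper prefix_match is hypothetical.

```python
import re


def prefix_match(pattern: str, text: str) -> bool:
    """Model of a prefix match: the pattern must match at the start of the text."""
    # Assumption: re.match() is used as a stand-in for the comparison
    # described above; it does not require the text to end after the match.
    return re.match(pattern, text) is not None


cases = [
    ("/var", "/var/log"),            # prefix match
    ("/var", "/test/var"),           # /var is not at the beginning
    ("/var$", "/var"),               # exact match thanks to $
    ("/var$", "/var/log"),
    (".*/var$", "/test/var"),        # anything before, nothing after
    (".*/var", "/test/var/log"),     # infix match
    (".*/var", r"\test\var\log"),    # backslashes are not slashes
]

for pattern, text in cases:
    print(f"{pattern!r} vs {text!r}: {prefix_match(pattern, text)}")
```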

An exception to the rule that Check_MK always uses a prefix match is the
Event Console (EC), which always works with an infix match, so it only
checks whether the expression is contained in the text. Here, prefixing a ^
forces a match at the beginning, in other words a prefix match.

Regular Expression in EC    Match           Match             No match
ORA                         ORACLEserver    myORACLEserver    myoracleserver
^ORA                        ORACLEserver    ORACLEhost        myORACLEserver
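
The difference between an infix match and a match forced to the beginning
with ^ can be sketched in Python as follows. The sketch assumes that
re.search() models the infix match of the Event Console and that upper and
lower case are distinguished; the pattern ORA and the helper infix_match are
illustrative choices, not taken from the EC itself.

```python
import re


def infix_match(pattern: str, text: str) -> bool:
    """Model of an infix match: the pattern may match anywhere in the text."""
    # Assumption: re.search() stands in for the Event Console's comparison.
    return re.search(pattern, text) is not None


cases = [
    ("ORA", "ORACLEserver"),     # found at the beginning
    ("ORA", "myORACLEserver"),   # found in the middle
    ("ORA", "myoracleserver"),   # lower case does not match in this model
    ("^ORA", "ORACLEhost"),      # ^ demands a match at the beginning
    ("^ORA", "myORACLEserver"),  # ORA is not at the beginning
]

for pattern, text in cases:
    print(f"{pattern!r} vs {text!r}: {infix_match(pattern, text)}")
```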

1.6. Alternatives

With a vertical bar (|), a logical OR, you can define alternatives:
1|2|3 thus matches 1, 2 or 3. If the alternatives are required in
the middle of an expression, enclose them in parentheses.

Regular Expression               Match                Match             No match
CPU load|core|memory             CPU load             core              CPU utilisation
01|02|1[1-5]                     01                   11 to 15          05
server\.(intern|dmz|123)\.net    server.intern.net    server.dmz.net    server.extern.net
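
The following Python sketch shows alternatives with and without parentheses.
It assumes Python's re module as the regex engine and tests the complete text
with re.fullmatch; alt_match is a hypothetical helper.

```python
import re


def alt_match(pattern: str, text: str) -> bool:
    """Return True if the pattern matches the complete text."""
    return re.fullmatch(pattern, text) is not None


cases = [
    ("CPU load|core|memory", "core"),
    ("CPU load|core|memory", "CPU utilisation"),
    ("01|02|1[1-5]", "13"),
    ("01|02|1[1-5]", "05"),
    (r"server\.(intern|dmz|123)\.net", "server.dmz.net"),
    (r"server\.(intern|dmz|123)\.net", "server.extern.net"),
    # Without parentheses the bar splits the whole expression in two:
    # "server\.intern" OR "dmz\.net".
    (r"server\.intern|dmz\.net", "dmz.net"),
    (r"server\.intern|dmz\.net", "server.dmz.net"),
]

for pattern, text in cases:
    print(f"{pattern!r} vs {text!r}: {alt_match(pattern, text)}")
```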

1.7. Match groups

In the Event Console, in Business Intelligence (BI) and also in
Bulk renaming of hosts it is possible to refer to
parts of the text that were matched in the original text.
To do so, mark patterns within the regular expression with parentheses.
The part of the text that matches the first parenthesized expression is available
in the substitution as \1, the second as \2, etc.

Regular Expression    Text                  Group 1    Group 2
([a-z]+)([123]+)      abc123                abc        123
server-(.*)\.local    server-lnx02.local    lnx02

The image below shows such a rename. All host names that match the regular
expression server-(.*)\.local are replaced with
\1.servers.local. Here \1 stands for exactly the text
that was 'captured' by the .* in the parentheses.

As a concrete example, server-lnx02.local will be renamed to
lnx02.servers.local.

Groups can of course also be combined with the repetition operators
*, +, ? and {...}. Thus, for example,
the expression (/local)?/share matches /local/share as well
as /share.
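
Match groups and the \1 substitution can be reproduced with Python's re
module. The sketch below is an illustration under assumptions: rename_host is
a hypothetical helper, and it assumes that a matching host name is replaced
completely by the expanded substitution text.

```python
import re


def rename_host(old_name: str, pattern: str, replacement: str) -> str:
    """Return the new name, or the old name unchanged if the pattern fails."""
    match = re.match(pattern, old_name)
    if match is None:
        return old_name
    # Match.expand() fills \1, \2, ... with the captured groups.
    return match.expand(replacement)


print(rename_host("server-lnx02.local", r"server-(.*)\.local", r"\1.servers.local"))

# With the quantifier inside the parentheses, the group holds the whole run.
print(re.match(r"([a-z]+)([123]+)", "abc123").groups())
# With the quantifier outside, only the last repetition is kept.
print(re.match(r"([a-z])+([123])+", "abc123").groups())

# A whole group can be made optional with ?.
print(re.fullmatch(r"(/local)?/share", "/share") is not None)
```

Placing the quantifier inside the parentheses is what makes a group capture
an entire sequence such as abc rather than its last character.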

2. Table of all special characters

Here is a summary of all of the special characters as described above and the
functions performed by the regular expressions as used in Check_MK:

The following characters must be escaped with a backslash if they are to
be explicitly used: \ . * + ? { } ( ) [ ] | & ^ $

3. If you'd like to learn the full details

Back in the '60s, Ken Thompson, one of the inventors of UNIX, had already developed
the first regular expressions in their current form – including today's standard Unix
command grep. Since then countless extensions and dialects have been derived
from standard expressions – including extended regexes, Perl-compatible regexes and
a very similar variant in Python.

In Filters in views, Check_MK uses POSIX extended regular
expressions (extended REs). These are evaluated in the monitoring core in C,
using the regex functions of the C library. A complete reference for this
subject can be found in the Linux man page for regex(7).

In all other places, all of Python's additional options for regular expressions
are available as well. These places include, among others, the configuration rules,
the Event Console and Business Intelligence (BI). Python regexes are an
extension of the extended REs, and they are very similar to those of Perl.
They support, e.g., the so-called negative lookahead, a non-greedy variant of the
asterisk *, or forced case-sensitive matching. The detailed options for these
regexes can be found in the Python online help for the re module.
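
The three Python features named above can be illustrated with the re module
directly. This is a sketch only: the host names and texts are hypothetical,
and the scoped inline flag (?-i:...) requires Python 3.6 or later.

```python
import re

# Negative lookahead: "server-" must not be followed by "test".
not_test = re.compile(r"server-(?!test).*")
print(not_test.match("server-prod01") is not None)
print(not_test.match("server-test01") is not None)

# Greedy versus non-greedy asterisk: *? stops at the first possible ">".
text = "<a><b>"
greedy = re.match(r"<(.*)>", text).group(1)
lazy = re.match(r"<(.*?)>", text).group(1)
print(greedy, lazy)

# Forced case sensitivity for one part of an otherwise case-insensitive
# expression: (?-i:...) switches IGNORECASE off inside the group.
mixed = re.compile(r"(?-i:CPU) load", re.IGNORECASE)
print(mixed.match("CPU LOAD") is not None)
print(mixed.match("cpu load") is not None)
```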