3.3.3 Special characters

Text encoding

LilyPond uses the character repertoire defined by the Unicode
consortium and ISO/IEC 10646. This defines a unique name and
code point for the character sets used in virtually all modern
languages and many others too. Unicode can be implemented using
several different encodings. LilyPond uses the UTF-8 encoding
(UTF stands for Unicode Transformation Format) which represents
the basic Latin (ASCII) characters in one byte, and represents other
characters using a variable length format of up to four bytes.

The actual appearance of the characters is determined by the
glyphs defined in the particular fonts available; a font defines
the mapping of a subset of the Unicode code points to glyphs.
LilyPond uses the Pango library to lay out and render multi-lingual
texts.

LilyPond does not perform any input-encoding conversions. This
means that any text, be it title, lyric text, or musical
instruction containing non-ASCII characters, must be encoded in
UTF-8. The easiest way to enter such text is by using a
Unicode-aware editor and saving the file with UTF-8 encoding. Most
popular modern editors support UTF-8; vim, Emacs, jEdit, and GEdit
all do. All MS Windows systems later than NT use
Unicode as their native character encoding, so even Notepad can
edit and save a file in UTF-8 format. A more functional
alternative for Windows is BabelPad.
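
As a minimal sketch of direct UTF-8 entry, the fragment below types
accented characters straight into lyric text. The notes and words are
illustrative placeholders, and the file is assumed to be saved with
UTF-8 encoding:

```lilypond
% Save this file as UTF-8; the accented letters are typed directly.
% The melody and lyric text are hypothetical placeholders.
\relative c' { c4 d e f }
\addlyrics { Ça va très bien }
```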

If a LilyPond input file containing a non-ASCII character is not
saved in UTF-8 format, LilyPond generates an error message.

Unicode

To enter a single character for which the Unicode code point is
known but which is not available in the editor being used, use
either \char ##xhhhh or \char #dddd within a
\markup block, where hhhh is the hexadecimal code for
the character required and dddd is the corresponding decimal
value. Leading zeroes may be omitted, but it is usual to specify
all four characters in the hexadecimal representation. (Note that
the UTF-8 encoding of the code point should not be used
after \char, as UTF-8 encodings contain extra bits indicating
the number of octets.) Unicode code charts and a character name
index giving the code point in hexadecimal for any character can be
found on the Unicode Consortium website,
http://www.unicode.org/.

For example, \char ##x03BE and \char #958 would both
enter the Unicode U+03BE character, which has the Unicode name
“Greek Small Letter Xi”.
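
A minimal sketch comparing the two forms, assuming an installed font
that contains the glyph; both \char commands should print the same
letter:

```lilypond
\markup {
  % hexadecimal form of U+03BE
  \char ##x03BE
  % decimal form of the same code point (hex 03BE = decimal 958)
  \char #958
}
```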

Any Unicode code point may be entered in this way, and if all special
characters are entered in this format, it is not necessary to save
the input file in UTF-8 format. Of course, a font containing all
such encoded characters must be installed and available to LilyPond.

The following example shows Unicode hexadecimal values being entered
in four places – in a rehearsal mark, as articulation text, in
lyrics and as stand-alone text below the score:
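
One way such an input might look is sketched here; the notes, lyric
syllables, and choice of code points are illustrative assumptions
rather than a prescribed pattern:

```lilypond
\score {
  \relative c'' {
    % rehearsal mark: U+03EE (Coptic Capital Letter Dei)
    c1 \mark \markup { \char ##x03EE }
    % text attached below the note: U+03B1 (alpha) and U+03C9 (omega)
    c1_\markup { \tiny { \char ##x03B1 " to " \char ##x03C9 } }
  }
  % lyrics: U+0153 (Latin Small Ligature OE) inside a syllable
  \addlyrics { O \markup { \concat { Ph \char ##x0153 be! } } }
}
% stand-alone text below the score: U+00A9 (Copyright Sign)
\markup { "Copyright" \char ##x00A9 }
```

Because every special character here is given by code point, the file
itself contains only ASCII and does not depend on the editor's
encoding.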