HTML: ASCII Character Codes

The table below presents the ASCII characters with codes 32-127 and the
extended (ISO-8859-1) characters with codes 160-255.
The missing characters (0-31 and 128-159) are 'control characters' and are
not normally suitable for output in HTML. The 'Entity' column gives the
equivalent HTML character entity, where one exists.

For details on non-ASCII characters supported in HTML follow the link
under References below.

So why do we need all these different ways of referencing the same
characters? The Decimal values are rarely used directly, although they
appear in numeric character references such as &#160;. Octal codes turn up
as string escapes in various programming languages, and the Hex values
appear in URL-encoded strings (%20 represents a space, for example). In
HTML content the Symbol itself is used, except where a proper
Character Entity is available.
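
To make the relationship between these notations concrete, here is a minimal PHP sketch that prints the different forms of a single character. The function name describeChar is hypothetical, and the sketch assumes single-byte input, since ord() only looks at the first byte.

```php
<?php
// Return the common notations for one single-byte character.
// describeChar() is an illustrative helper, not a built-in function.
function describeChar(string $char): array
{
    $code = ord($char); // numeric code of the first byte (0-255)

    return [
        'decimal' => $code,
        'octal'   => sprintf('\\%03o', $code),      // e.g. \040
        'hex'     => sprintf('%02X', $code),        // e.g. 20
        'url'     => rawurlencode($char),           // e.g. %20
        'entity'  => htmlspecialchars($char, ENT_QUOTES), // e.g. &amp;
    ];
}

print_r(describeChar(' ')); // decimal 32, octal \040, hex 20, url %20
print_r(describeChar('&')); // decimal 38, octal \046, hex 26, url %26, entity &amp;
```

Characters with no special meaning in HTML, such as the space, come back unchanged in the 'entity' slot, which matches the rule of using the Symbol unless an entity is needed.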

For more information on encoding special characters see the related
article linked below.

Common Windows-1252 Character Codes

If your data has been corrupted with Windows-specific characters such
as smart quotes, ellipses, dashes and non-breaking spaces, the
following list might be useful:

Decimal  Octal  Description          Plain Text Alternative
133      \205   ELLIPSIS             ...
145      \221   HIGH 6 SINGLE QUOTE  '
146      \222   HIGH 9 SINGLE QUOTE  '
147      \223   HIGH 6 DOUBLE QUOTE  "
148      \224   HIGH 9 DOUBLE QUOTE  "
149      \225   LARGE CENTERED DOT   *
150      \226   EN DASH              -
151      \227   EM DASH              - or --
160      \240   NO-BREAK SPACE       (space)

Other replacement values are also possible including various valid
HTML entities (see above) or multibyte characters.
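
As a rough sketch, the list can be applied in one pass with PHP's strtr. The function name windowsToPlainText is hypothetical, and the code assumes the input is single-byte Windows-encoded text rather than UTF-8, where the same byte values can occur inside multibyte characters.

```php
<?php
// Replace common Windows-specific bytes with plain text alternatives.
// Assumption: $input is single-byte text, NOT UTF-8.
function windowsToPlainText(string $input): string
{
    $map = [
        "\205" => '...', // 133 ellipsis
        "\221" => "'",   // 145 high 6 single quote
        "\222" => "'",   // 146 high 9 single quote
        "\223" => '"',   // 147 high 6 double quote
        "\224" => '"',   // 148 high 9 double quote
        "\225" => '*',   // 149 large centered dot
        "\226" => '-',   // 150 en dash
        "\227" => '--',  // 151 em dash
        "\240" => ' ',   // 160 no-break space
    ];

    // strtr() with an array replaces every key in a single pass.
    return strtr($input, $map);
}

// Prints: "Wait..." - it's done
echo windowsToPlainText("\223Wait\205\224 \226 it\222s done"), "\n";
```

The map uses double-quoted keys so that PHP turns each octal escape into the corresponding byte.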

The \ in front of the Octal code for these characters is the escape
syntax: inside a double-quoted PHP string, \205 produces the single byte
with that octal value, so we can use the codes in regular expressions as
shown here:

$output = mb_eregi_replace("\205", "...", $input);
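
The short sketch below illustrates how the octal escape relates to the other notations in PHP. The variable names are only for illustration, and the sample string is made up.

```php
<?php
// Three ways of writing the same single byte (decimal 133).
$octal       = "\205";   // octal escape
$hex         = "\x85";   // hexadecimal escape
$fromDecimal = chr(133); // built from the decimal code

var_dump($octal === $hex, $hex === $fromDecimal); // bool(true) bool(true)

// Single quotes do NOT interpret the escape: this is four characters.
$literal = '\205';
var_dump(strlen($octal), strlen($literal)); // int(1) int(4)

// The same byte used in a preg_replace pattern on single-byte text.
echo preg_replace("/\205/", '...', "Wait\205"), "\n"; // Wait...
```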

You might find this necessary when converting data to UTF-8 or other
multibyte character formats. For non-multibyte formats you can use the
regular preg_replace function instead (ereg_replace was deprecated in
PHP 5.3 and removed in PHP 7.0),
as there's no danger of corrupting the text by accidentally replacing
one byte of a multibyte character.
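
To show what that corruption looks like, here is a small sketch in which a byte-level replacement damages a UTF-8 string. It assumes the mbstring extension is available. The function replaceEllipsisByte is a hypothetical guard, and checking for valid UTF-8 is only a heuristic, since some single-byte text can also happen to be valid UTF-8.

```php
<?php
// "\xC3\x85" is the UTF-8 encoding of the letter A with ring above.
// Its second byte is 0x85 (octal 205), the same value as the
// Windows ellipsis.
$utf8 = "\xC3\x85ngstr\xC3\xB6m";

// A byte-level replacement splits the two-byte character.
$broken = str_replace("\205", '...', $utf8);
var_dump(mb_check_encoding($utf8, 'UTF-8'));   // bool(true)
var_dump(mb_check_encoding($broken, 'UTF-8')); // bool(false)

// Hypothetical guard: only replace bytes when the input is not
// already valid UTF-8.
function replaceEllipsisByte(string $input): string
{
    if (mb_check_encoding($input, 'UTF-8')) {
        return $input; // UTF-8 or plain ASCII: leave untouched
    }

    return str_replace("\205", '...', $input);
}

echo replaceEllipsisByte("Wait\205"), "\n";     // Wait...
var_dump(replaceEllipsisByte($utf8) === $utf8); // bool(true)
```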