Description

html_entity_decode() is the opposite of
htmlentities() in that it converts HTML entities
in the string to their corresponding characters.

More precisely, this function decodes all the entities (including all numeric
entities) that a) are necessarily valid for the chosen document type — i.e.,
for XML, this function does not decode named entities that might be defined
in some DTD — and b) whose character or characters are in the coded character
set associated with the chosen encoding and are permitted in the chosen
document type. All other entities are left as is.
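As a rough illustration of the two conditions above, the following sketch decodes the same kind of input under different document types and encodings. The sample strings and variable names are made up for the example, and the encoding is passed explicitly so the result does not depend on the configured default.

```php
<?php
// Named HTML entities are not defined in XML 1, so ENT_XML1 leaves &eacute;
// untouched but still decodes the numeric form and the predefined &amp;.
$xml = html_entity_decode('&eacute; &#233; &amp;', ENT_QUOTES | ENT_XML1, 'UTF-8');

// The euro sign has no code point in ISO-8859-1, so the entity is left as is.
$latin1 = html_entity_decode('&euro;', ENT_QUOTES | ENT_HTML401, 'ISO-8859-1');

// With UTF-8 as the target encoding the same entity decodes to U+20AC.
$utf8 = html_entity_decode('&euro;', ENT_QUOTES | ENT_HTML401, 'UTF-8');

var_dump($xml, $latin1, $utf8);
```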

Parameters

string

The input string.

flags

A bitmask of one or more of the following flags, which specify how to handle quotes and
which document type to use. The default is ENT_COMPAT | ENT_HTML401.

Available flags constants

Constant Name    Description

ENT_COMPAT       Will convert double-quotes and leave single-quotes alone.
ENT_QUOTES       Will convert both double and single quotes.
ENT_NOQUOTES     Will leave both double and single quotes unconverted.
ENT_HTML401      Handle code as HTML 4.01.
ENT_XML1         Handle code as XML 1.
ENT_XHTML        Handle code as XHTML.
ENT_HTML5        Handle code as HTML 5.
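The effect of the three quote flags can be seen with a small sketch. The input string and variable names are only illustrative; the document type and encoding are spelled out so the defaults do not matter.

```php
<?php
$input = '&quot;double&quot; and &#039;single&#039;';

// ENT_COMPAT: only the double-quote entities are decoded.
$compat = html_entity_decode($input, ENT_COMPAT | ENT_HTML401, 'UTF-8');

// ENT_QUOTES: both double- and single-quote entities are decoded.
$quotes = html_entity_decode($input, ENT_QUOTES | ENT_HTML401, 'UTF-8');

// ENT_NOQUOTES: neither kind of quote entity is decoded.
$noquotes = html_entity_decode($input, ENT_NOQUOTES | ENT_HTML401, 'UTF-8');

echo $compat, "\n", $quotes, "\n", $noquotes, "\n";
```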

encoding

An optional argument defining the encoding used when converting characters.

If omitted, the default value of the encoding varies
depending on the PHP version in use. In PHP 5.6 and later, the
default_charset configuration
option is used as the default value. PHP 5.4 and 5.5 will use
UTF-8 as the default. Earlier versions of PHP use
ISO-8859-1.

Although this argument is technically optional, you are highly encouraged to
specify the correct value for your code if you are using PHP 5.5 or earlier,
or if your default_charset
configuration option may be set incorrectly for the given input.
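A minimal sketch of why passing the encoding explicitly matters: the same entity decodes to different bytes depending on the target encoding. The variable names are hypothetical, and bin2hex() is used only to make the bytes visible.

```php
<?php
$entity = '&eacute;';

// Decoded to UTF-8, the character occupies two bytes.
$asUtf8 = html_entity_decode($entity, ENT_QUOTES | ENT_HTML401, 'UTF-8');

// Decoded to ISO-8859-1, it is a single byte.
$asLatin1 = html_entity_decode($entity, ENT_QUOTES | ENT_HTML401, 'ISO-8859-1');

echo bin2hex($asUtf8), "\n";   // c3a9
echo bin2hex($asLatin1), "\n"; // e9

// The value that applies when the argument is omitted (PHP 5.6 and later).
echo ini_get('default_charset'), "\n";
```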

The following character sets are supported:

Supported charsets

Charset        Aliases       Description

ISO-8859-1     ISO8859-1     Western European, Latin-1.
ISO-8859-5     ISO8859-5     Little used Cyrillic charset (Latin/Cyrillic).
ISO-8859-15    ISO8859-15    Western European, Latin-9. Adds the Euro sign, French and Finnish letters missing in Latin-1 (ISO-8859-1).

Note:
Any other character sets are not recognized. The default encoding will be
used instead and a warning will be emitted.

Return Values

Returns the decoded string.

Changelog

Version    Description

5.6.0      The default value for the encoding parameter was changed to be the value of the default_charset configuration option.
5.4.0      Default encoding changed from ISO-8859-1 to UTF-8.
5.4.0      The constants ENT_HTML401, ENT_XML1, ENT_XHTML and ENT_HTML5 were added.

Examples

Example #1 Decoding HTML entities

<?php
$orig = "I'll \"walk\" the <b>dog</b> now";

$a = htmlentities($orig);

$b = html_entity_decode($a);

echo $a; // I'll &quot;walk&quot; the &lt;b&gt;dog&lt;/b&gt; now

echo $b; // I'll "walk" the <b>dog</b> now
?>

Notes

Note:

You might wonder why trim(html_entity_decode('&nbsp;')); doesn't
reduce the string to an empty string. That's because the '&nbsp;'
entity is not ASCII code 32 (which is stripped by
trim()) but character code 160 (0xa0) in the ISO
8859-1 encoding, or U+00A0 (bytes 0xc2 0xa0) when decoding to UTF-8.
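One possible way to strip a decoded non-breaking space, sketched here under the assumption that the string is UTF-8, is a Unicode-aware regular expression rather than trim(). The variable names are illustrative.

```php
<?php
// Decoding to UTF-8 yields U+00A0, which trim() does not treat as whitespace.
$decoded = html_entity_decode('&nbsp;', ENT_QUOTES | ENT_HTML401, 'UTF-8');

$plainTrim = trim($decoded); // still contains the non-breaking space

// Strip ordinary whitespace and U+00A0 from both ends; the /u modifier makes
// the pattern operate on code points instead of single bytes.
$unicodeTrim = preg_replace('/^[\s\x{A0}]+|[\s\x{A0}]+$/u', '', $decoded);

var_dump($plainTrim === '', $unicodeTrim === '');
```

A regular expression is used instead of adding the raw bytes to trim()'s character list, because trim() works byte by byte and could cut into other multibyte UTF-8 characters.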

My understanding is that the flag to use is the one corresponding to the expected decoded output: ENT_QUOTES when the entities should become single or double quote characters, and so on.

I had a problem getting the 'TM' trademark symbol to display correctly in an email subject line. Using html_entity_decode() with different charsets didn't work, but directly replacing the entity with its ASCII equivalent did.
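The commenter's own snippet is not preserved here. The following is only a sketch of what such a replacement might look like, assuming the subject is meant to be UTF-8; the $subject value is made up. It also shows one likely reason a single-byte charset fails: the trademark sign does not exist in ISO-8859-1, so the entity is left undecoded.

```php
<?php
$subject = 'Acme&trade; newsletter'; // hypothetical subject line

// ISO-8859-1 has no trademark sign, so the entity stays as it is.
$latin1 = html_entity_decode($subject, ENT_QUOTES | ENT_HTML401, 'ISO-8859-1');

// With UTF-8 as the target encoding the entity decodes to U+2122.
$utf8 = html_entity_decode($subject, ENT_QUOTES | ENT_HTML401, 'UTF-8');

// Direct replacement of the one entity, as the comment describes.
$replaced = str_replace('&trade;', "\u{2122}", $subject);

var_dump($latin1 === $subject, $utf8 === $replaced);
```

A subject line containing non-ASCII bytes generally also needs MIME header encoding (for example with mb_encode_mimeheader()), which this sketch leaves out.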

I wrote in a previous comment that html_entity_decode() only handled about 100 characters. That's not quite true; it only handles entities that exist in the output character set (the third argument). If you want to get ALL HTML entities, make sure you use ENT_QUOTES and set the third argument to 'UTF-8'.

If you don't want a UTF-8 string, you'll need to convert it afterward with something like utf8_decode(), iconv(), or mb_convert_encoding().
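A minimal sketch of the decode-then-convert approach, assuming the mbstring and iconv extensions are available; the sample string is made up.

```php
<?php
// Decode everything to UTF-8 first so no entity is skipped.
$utf8 = html_entity_decode('caf&eacute;', ENT_QUOTES | ENT_HTML401, 'UTF-8');

// Then convert to the encoding actually wanted, here ISO-8859-1.
$viaMbstring = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');
$viaIconv    = iconv('UTF-8', 'ISO-8859-1', $utf8);

echo bin2hex($viaMbstring), "\n"; // 636166e9
```

Characters that have no equivalent in the target encoding cannot survive this step, so the conversion is only lossless when the text fits the target character set.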

If you're producing XML, which doesn't recognise most HTML entities:

When producing a UTF-8 document (the default), use htmlspecialchars(html_entity_decode($string, ENT_QUOTES, 'UTF-8'), ENT_NOQUOTES, 'UTF-8') (because you only need to escape < and > and & unless you're printing inside the XML tags themselves).
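Spelled out step by step, the expression above behaves roughly as follows. The sample input is made up for the illustration.

```php
<?php
$string = 'caf&eacute; &amp; &lt;b&gt; &quot;x&quot;';

// Step 1: turn every entity into the real UTF-8 character.
$decoded = html_entity_decode($string, ENT_QUOTES, 'UTF-8');

// Step 2: re-escape only &, < and >, leaving quotes and other characters alone.
$xmlSafe = htmlspecialchars($decoded, ENT_NOQUOTES, 'UTF-8');

echo $xmlSafe, "\n";
```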

Otherwise, either convert all the named entities to numeric ones, or declare the named entities in the document's DTD. The full list of 252 entities can be found in the HTML 4.01 Spec, or you can cut and paste the function from my site (http://inanimatt.com/php-convert-entities.php).
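The first option, converting named entities to numeric ones, could be sketched as below. This is not the function from the linked site; named_to_numeric_entities() is a hypothetical helper written for this example, and it assumes the mbstring extension and PHP 7.2 or later for mb_ord().

```php
<?php
function named_to_numeric_entities(string $text): string
{
    // Entities that XML predefines itself; these can stay as they are.
    $xmlPredefined = ['&amp;', '&lt;', '&gt;', '&quot;', '&apos;'];

    return preg_replace_callback(
        '/&[A-Za-z][A-Za-z0-9]*;/',
        function (array $m) use ($xmlPredefined): string {
            if (in_array($m[0], $xmlPredefined, true)) {
                return $m[0];
            }
            // Decode the single entity to its UTF-8 character.
            $char = html_entity_decode($m[0], ENT_QUOTES | ENT_HTML401, 'UTF-8');
            if ($char === $m[0]) {
                return $m[0]; // not a known HTML 4.01 entity, leave untouched
            }
            // Re-emit the character as a decimal numeric reference.
            return '&#' . mb_ord($char, 'UTF-8') . ';';
        },
        $text
    );
}

echo named_to_numeric_entities('caf&eacute; &amp; &nbsp;&euro; &bogus;'), "\n";
// caf&#233; &amp; &#160;&#8364; &bogus;
```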